LONG-TIME BEHAVIOR OF A FINITE VOLUME DISCRETIZATION FOR A
FOURTH ORDER DIFFUSION EQUATION


JAN MAAS AND DANIEL MATTHES 


Abstract. We consider a non-standard finite-volume discretization of a strongly non-linear
fourth order diffusion equation on the d-dimensional cube, for arbitrary d ≥ 1. The scheme
preserves two important structural properties of the equation: the first is the interpretation as a
gradient flow in a mass transportation metric, and the second is an intimate relation to a linear
Fokker-Planck equation. Thanks to these structural properties, the scheme possesses two discrete
Lyapunov functionals. These functionals approximate the entropy and the Fisher information,
respectively, and their dissipation rates converge to the optimal ones in the discrete-to-continuous
limit. Using the dissipation, we derive estimates on the long-time asymptotics of the discrete
solutions. Finally, we present results from numerical experiments which indicate that our
discretization is able to capture significant features of the complex original dynamics, even with a
rather coarse spatial resolution.


1. Introduction 


1.1. The QDD equation. In this note, we introduce and analyze a particular spatial discretization
of the following non-linear parabolic equation of fourth order,

(1)  $\partial_t u = -\nabla\cdot\Big(u\nabla\Big[\frac{\Delta u}{u} + \Delta\log u\Big]\Big) + \nabla\cdot(u\nabla W), \quad u = u(t;x) > 0, \quad t > 0, \quad x \in \Omega := [0,1]^d,$

subject to variational boundary conditions, see (5) below. The potential $W : \Omega \to \mathbb{R}$ is assumed
to satisfy certain structural conditions (4) and (15); a possible choice is $W(x) = \frac{\Lambda}{2}|x - \bar{x}|^2$ for
arbitrary $\Lambda > 0$ and $\bar{x} \in \mathbb{R}^d$.

Equation (1), which is referred to as Quantum drift diffusion (QDD) equation or as
Derrida-Lebowitz-Speer-Spohn (DLSS) equation in the literature, appears, e.g., in semi-conductor modelling
[…] and in the analysis of interface motion in spin systems […]. Depending on the context,
the non-linear term is written in one of several equivalent forms:

$\nabla\cdot\Big(u\nabla\Big[\frac{\Delta u}{u} + \Delta\log u\Big]\Big) = 4\,\nabla\cdot\Big(u\nabla\Big[\frac{\Delta\sqrt{u}}{\sqrt{u}}\Big]\Big) = 2\,\nabla^2 : (u\nabla^2\log u).$


Existence and qualitative properties of (weak) solutions to (1) have been intensively analyzed in
the past two decades […]. For instance, it has been proven (see [20] for the
most comprehensive result) that the initial boundary value problem for (1) and (5) possesses a
non-negative and mass preserving global weak solution $u : \mathbb{R}_+ \times \Omega \to \mathbb{R}$ for all non-negative initial
conditions $u_0 \in L^1(\Omega)$ of finite entropy. By scaling invariance, we may assume without loss of
generality in the following that the solution u is a time-dependent probability density.

Several (semi-)discrete approximations of (1) have been studied, both analytically and
numerically. The schemes presented in […] inherit some structural properties of (1), like
monotonicity of certain quantities. All of these schemes have in common that they provide
non-negative (semi-)discrete solutions.

Here we continue in the spirit of [34], where a discretization was performed on grounds of the
gradient flow structure of (1) with respect to the $L^2$-Wasserstein metric, which leads to a scheme that
simultaneously preserves two essential Lyapunov functionals. From these Lyapunov functionals,
estimates on the fully discrete solutions were derived and have been used to analyze their long-time
asymptotics [39] and the discrete-to-continuous limit [34].

However, here we do not use the Lagrangian structure behind (1), which was essential in
[34, 39], but define a scheme on grounds of a finite-volume discretization. Our ansatz is
motivated by a particular structure-preserving discretization of linear Fokker-Planck equations, which
has been introduced simultaneously in […]. Using this “Eulerian approach”, we overcome the
limitation of [34, 39] to d = 1 space dimension. The similarities with [34, 39] are that we rely on the
gradient flow formulation of (1), and that we design the discretization in a way that enforces
monotonicity of two Lyapunov functionals. We remark that the general idea to preserve
simultaneous monotonicity of several functionals in the discretization has been used for other equations
before, like in the context of the formally similar thin film equations, see [22, 23, …].

1.2. Structural properties and long-time asymptotics. Most of the qualitative results for
(1) are based on two fundamental structural properties: the first is its gradient flow structure
with respect to the $L^2$-Wasserstein metric [20], and the second is an intimate relation to a certain
Fokker-Planck equation […]. That Fokker-Planck equation has the form

(2)  $\partial_s v_s = \Delta v_s + \nabla\cdot(v_s\nabla V)$ in $\Omega$,  $\partial_\nu(v_s/\pi) = 0$ on $\partial\Omega$,

where $\pi : \Omega \to \mathbb{R}_+$ is given by

(3)  $\pi(x) = \frac{1}{Z}e^{-V(x)}$ with $Z = \int_\Omega e^{-V(x')}\,dx'$,

and defines the unique stationary probability density $\pi$ for (2). To establish the connection between
(1) and (2), we shall assume henceforth that the respective potentials V and W are related via

(4)  $W = |\nabla V|^2 - 2\Delta V,$

and that V is $\lambda$-convex with some positive $\lambda$, i.e., $\nabla^2 V \ge \lambda > 0$. Notice that $V(x) = \frac{\lambda}{2}|x - \bar{x}|^2$ is
an admissible choice, and leads to $W(x) = \lambda^2|x - \bar{x}|^2 - 2d\lambda$.
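
The relation (4) for the quadratic example can be checked numerically. The following is a minimal sketch, not part of the scheme itself: the function names and the finite-difference step `eps` are illustrative assumptions. It evaluates $|\nabla V|^2 - 2\Delta V$ by central differences and compares it with the closed form $\lambda^2|x-\bar{x}|^2 - 2d\lambda$.

```python
def V(x, lam, xbar):
    """Quadratic potential V(x) = (lam/2) * |x - xbar|^2."""
    return 0.5 * lam * sum((xi - bi) ** 2 for xi, bi in zip(x, xbar))


def W_from_V(x, lam, xbar, eps=1e-3):
    """Approximate W = |grad V|^2 - 2 * Laplace(V) by central differences.

    The step size eps is an illustrative choice; for a quadratic V the
    central differences are exact up to rounding.
    """
    v0 = V(x, lam, xbar)
    grad_sq, laplace = 0.0, 0.0
    for k in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[k] += eps
        xm[k] -= eps
        vp, vm = V(xp, lam, xbar), V(xm, lam, xbar)
        grad_sq += ((vp - vm) / (2.0 * eps)) ** 2
        laplace += (vp - 2.0 * v0 + vm) / eps ** 2
    return grad_sq - 2.0 * laplace


def W_closed_form(x, lam, xbar):
    """Closed form from the text: lam^2 * |x - xbar|^2 - 2 * d * lam."""
    dist_sq = sum((xi - bi) ** 2 for xi, bi in zip(x, xbar))
    return lam ** 2 * dist_sq - 2.0 * len(x) * lam


if __name__ == "__main__":
    point, center, lam = [0.2, 0.7, 0.4], [0.5, 0.5, 0.5], 1.5
    print(W_from_V(point, lam, center), W_closed_form(point, lam, center))
```

Both values should agree up to rounding errors in any dimension d, since the Laplacian of the quadratic potential is the constant $d\lambda$.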

A direct computation shows that $\pi$ is a stationary solution to (1) as well, provided the boundary
conditions are chosen appropriately:

(5)  $\partial_\nu(u/\pi) = \partial_\nu\Big(\frac{\Delta u}{u} + \Delta\log u - W\Big) = 0$ on $\partial\Omega$.

Another formal computation reveals that (1) and (2) have two Lyapunov functionals in common,
namely the relative logarithmic entropy $\mathcal{H}$ and the relative Fisher information $\mathcal{I}$, given by

(6)  $\mathcal{H}(u) = \int_\Omega u\log(u/\pi)\,dx, \quad \mathcal{I}(u) = \int_\Omega u\,|\nabla\log(u/\pi)|^2\,dx.$

In fact, both (2) and (1) are gradient flows, for $\mathcal{H}$ and for $\mathcal{I}$, respectively, in the $L^2$-Wasserstein
metric. That is, formally, we can write

(7)  $\partial_s v = -\mathbf{K}_v\mathrm{D}_v\mathcal{H}$ and $\partial_t u = -\mathbf{K}_u\mathrm{D}_u\mathcal{I},$

respectively, where $\mathbf{K}$ is the Onsager operator (inverse metric tensor) of the Wasserstein metric,

$\mathbf{K}_u\xi = -\nabla\cdot(u\nabla\xi).$

The final but most important connection between (2) and (1) is the following relation between the
respective potentials of the two gradient flows:

(8)  $\mathcal{I}(v) = \mathrm{D}_v\mathcal{H}[\mathbf{K}_v\mathrm{D}_v\mathcal{H}].$

That is, the potential of the gradient flow (1) is the dissipation of the entropy $\mathcal{H}$ along its own
gradient flow. Despite the fact that the representation (8) of $\mathcal{I}$ is classical, implications on the
dynamics of the fourth order equation (1) have been drawn only recently in […], see also […].
It turns out […] that the equilibration behavior of (1) is intimately related to that of (2).
We summarize the relevant estimates. Thanks to the $\lambda$-convexity of V, it follows that both $\mathcal{H}$ and
$\mathcal{I}$ decay with exponential rate $\lambda$ along solutions v to (2),

(9)  $\mathcal{H}(v_s) \le \mathcal{H}(v_{s'})e^{-2\lambda(s-s')}$ and $\mathcal{I}(v_s) \le \mathcal{I}(v_{s'})e^{-2\lambda(s-s')}$ for all $s \ge s' \ge 0$,

and that the Fisher information can be estimated just in terms of the initial value of the entropy,

(10)  $\mathcal{I}(v_s) \le \mathcal{H}(v_0)\,s^{-1}$ for all $s > 0$.

With V and W related by (4), the following analogous estimates can be shown for solutions to the
QDD equation (1):

(11)  $\mathcal{H}(u_t) \le \mathcal{H}(u_{t'})e^{-(2\lambda)^2(t-t')}$ and $\mathcal{I}(u_t) \le \mathcal{I}(u_{t'})e^{-(2\lambda)^2(t-t')}$ for all $t \ge t' \ge 0$,

(12)  $\mathcal{I}(u_t) \le \mathcal{H}(u_0)(2\lambda t)^{-1}$ for all $t > 0$.

We review the derivation of (9)-(12) in Section 2.4.


1.3. Discretization and main result. The leading principle for our spatial discretization of (1) is
that the semi-discrete solutions to that scheme inherit the estimates in (11) and (12). We discretize
(1) and (2) simultaneously in order to preserve their close relation.

For the discretization of (2) we follow an approach based on the entropy gradient flow structure
for Markov chains developed in […], which has been subsequently applied in […].
We perform a finite volume discretization with a regular cubic lattice: fix a box length h = 1/N
with $N \in \mathbb{N}$ and consider piecewise constant probability densities $u^h$ on the equi-distant subdivision
of $\Omega$ in $N^d$ sub-cubes of side length h. Now, we replace (7) by

(13)  $\partial_s v^h = -\mathbf{K}^h_{v^h}\mathrm{D}_{v^h}\mathcal{H}^h$ and $\partial_t u^h = -\mathbf{K}^h_{u^h}\mathrm{D}_{u^h}\mathcal{I}^h,$

respectively, where the discretized entropy $\mathcal{H}^h$ is given (up to an additive constant $\gamma^h > 0$ defined in
(46)) by the restriction of $\mathcal{H}$, and the discretized Fisher information $\mathcal{I}^h$ is obtained by the relation
(8), i.e.,

(14)  $\mathcal{H}^h(u^h) = \mathcal{H}(u^h) - \gamma^h, \quad \mathcal{I}^h(u^h) = \mathrm{D}_{u^h}\mathcal{H}^h[\mathbf{K}^h_{u^h}\mathrm{D}_{u^h}\mathcal{H}^h].$

The discretized Onsager operator $\mathbf{K}^h$, which implicitly determines a metric on the piecewise
constant density functions, is designed such that the gradient flow of $\mathcal{H}^h$ is the forward equation
for a continuous time Markov chain. The appropriate and rather non-obvious choice for $\mathbf{K}^h$, see
(47), was independently found in [31] and in […].

For the main result that we formulate below we need an additional hypothesis on the potential
V, namely that

(15)  $V(x) = V^{[1]}(x_1) + \cdots + V^{[d]}(x_d)$

for suitable functions $V^{[k]} : [0,1] \to \mathbb{R}$, which, by definition of $\pi$ in (3), is equivalent to the following
factorization of the steady state:

(16)  $\pi(x) = \pi^{[1]}(x_1)\cdots\pi^{[d]}(x_d)$, where $\pi^{[k]}(y) = \frac{1}{Z^{[k]}}e^{-V^{[k]}(y)},$

with suitable normalization constants $Z^{[k]} > 0$ such that the $\pi^{[1]},\ldots,\pi^{[d]}$ are probability densities
on [0,1]. Under the discretization, $\pi$ is replaced by a particular piecewise constant approximation
$\pi^h$, which is the unique minimizer of $\mathcal{H}^h$, see Lemma 1. The approximation $\pi^h$ still factors in the
same form as above, see (45).


Theorem 1. Assume that a pair of potentials V, W satisfying the relation (4) and the technical
hypothesis (15) is given. Assume further that V is $\lambda$-convex with some $\lambda > 0$.
For a given discretization parameter h > 0, define discretized entropy and Fisher information as
in (14), and a discrete Onsager operator as in (47). Then any solution $u^h$ of the discrete gradient
flow

$\partial_t u^h = -\mathbf{K}^h_{u^h}\mathrm{D}_{u^h}\mathcal{I}^h$

satisfies the following analogues of (11) and (12),

(17)  $\mathcal{H}^h(u^h_t) \le \mathcal{H}^h(u^h_{t'})e^{-(2\lambda_h)^2(t-t')}$ and $\mathcal{I}^h(u^h_t) \le \mathcal{I}^h(u^h_{t'})e^{-(2\lambda_h)^2(t-t')}$ for all $t \ge t' \ge 0$,

(18)  $\mathcal{I}^h(u^h_t) \le \mathcal{H}^h(u^h_0)(2\lambda_h t)^{-1}$ for all $t > 0$.

Consequently, $u^h_t$ approaches the equilibrium $\pi^h$ exponentially fast,

(19)  $\|u^h_t - \pi^h\|_{L^1(\Omega)} \le \sqrt{2\mathcal{H}^h(u^h_0)}\,e^{-2(\lambda_h)^2t}.$

Above, $\lambda_h = \lambda + O(h^2)$ as $h \downarrow 0$.

1.4. Geodesic convexity vs. convex entropy decay. All of the (continuous and discrete)
equations under consideration here will be gradient flows of geodesically $\lambda$-convex functionals.
The proof of our main result, Theorem 1 above, however, does not require the full power
of $\lambda$-convexity. Instead, we work with a weaker property that we call convex decay inequality, see
(CDI) in Section 2.2. In a nutshell, the difference is that we do not require the Hessian of the
functional to be larger or equal to $\lambda$ in every direction, but only in the direction of the functional's
own gradient, at each given point.

This weaker form of convexity has been used in numerous places and in various disguises for the
derivation of equilibration estimates, typically in connection with the Bakry-Emery method, see
e.g. [32] and references therein. Recently, an adaptation of this convexity concept to Markov chains
has been developed in [5]. There are examples (see Remark 5) where the modulus of convexity
improves (slightly) upon relaxation from geodesic convexity to convex decay.

A key technical ingredient in the proof of our main result is the tensorization property of the
convex decay inequality. This result is established later in the paper and might be of independent interest.

1.5. Plan of the paper. In Section 2 below, we review the basic results from the general theory
of gradient flows and the Bakry-Emery method which are relevant for the study of our equation (1)
and our discretization. Sections 3 and 4 are devoted to discretizations. In Section 3 we analyze the
properties of a finite-volume discretization for the linear Fokker-Planck equation (2) in the spirit
of […]. In Section 4 we define a “compatible” discretization of the QDD equation and prove
the main result, Theorem 1. We conclude by discretizing in time as well, and performing a series
of numerical experiments in dimension d = 2 to illustrate the (non-)optimality of the theoretical
decay estimates.


2. Estimates for $\lambda$-convex gradient flows

In this section, we shall mainly collect and rephrase classical and recent results about the
large-time behavior of gradient flows. Throughout this section, we assume smoothness of all appearing
analytical structures. These smoothness assumptions are justified in the analysis of the
discretizations in Sections 3 and 4 below, provided that one restricts to strictly positive probability densities.
The application to solutions of the original evolution equation (1), however, is purely formal and
only serves as a motivation.

2.1. $\lambda$-convex gradient flows. Let a smooth Riemannian manifold $\mathfrak{M}$ with metric d be given.
For simplicity we assume that $\mathfrak{M}$ is an open subset of a (finite-dimensional) affine space X. At each
point $u \in \mathfrak{M}$, there is a one-to-one correspondence between the scalar product $\langle\cdot,\cdot\rangle_u$ on $T_u\mathfrak{M}$ and
the Onsager operator $\mathbf{K}_u : T^*_u\mathfrak{M} \to T_u\mathfrak{M}$, which is the uniquely determined linear isomorphism
with

$\langle\mathbf{K}_up, \xi\rangle_u = p[\xi]$ for all $\xi \in T_u\mathfrak{M}$ and $p \in T^*_u\mathfrak{M}$.

Note that $\mathbf{K}_u$ is symmetric, in the sense that

$p_1[\mathbf{K}_up_2] = \langle\mathbf{K}_up_1, \mathbf{K}_up_2\rangle_u = \langle\mathbf{K}_up_2, \mathbf{K}_up_1\rangle_u = p_2[\mathbf{K}_up_1]$ for all $p_1, p_2 \in T^*_u\mathfrak{M}$.

In the application discussed here, the Onsager operator (and not the scalar product) will be the given
quantity. In fact, in our application, the Onsager operator extends continuously to the boundary
of $\mathfrak{M}$ in X, while the Riemannian metric degenerates at the boundary.

The gradient flow of a given smooth potential $\Phi : \mathfrak{M} \to \mathbb{R}$ is then defined as (solution to) the
differential equation

(20)  $\dot{u} = F_\Phi(u) := -\mathbf{K}_u\mathrm{D}_u\Phi.$

By smoothness of $\Phi$, local solutions $u : [0,T) \to \mathfrak{M}$ to (20) exist for any initial condition $u_0 \in \mathfrak{M}$,
and the only possible obstruction to global existence is that u leaves $\mathfrak{M}$ at time T > 0.

A central notion in the theory is that of $\lambda$-convexity of $\Phi$ (with $\lambda \in \mathbb{R}$), which means that
$\mathrm{Hess}\,\Phi \ge \lambda$, where the Hessian is to be understood in the Riemannian structure of $\mathfrak{M}$, and the
inequality holds in the sense of quadratic forms:

$\mathrm{Hess}_u\Phi[\xi,\xi] \ge \lambda\|\xi\|_u^2$ for all $u \in \mathfrak{M}$ and $\xi \in T_u\mathfrak{M}$.


An elegant “Eulerian” approach for proving $\lambda$-convexity has been developed in […]. This
approach has been implemented in [16] on the manifold of probability measures over a finite state
space. A useful characterization of $\lambda$-convexity, that does not involve the metric but only the
Onsager operator, has been formulated in [30]: at each $u \in \mathfrak{M}$, define the bi-linear form $\mathbf{M}_u$ on $T^*_u\mathfrak{M}$
via

$\mathbf{M}_u[p,p] = -p\big[\mathrm{D}_uF_\Phi[\mathbf{K}_up]\big] + \frac{1}{2}p\big[\mathrm{D}_u\mathbf{K}[F_\Phi(u)]p\big].$

Then the functional $\Phi$ is $\lambda$-convex if and only if the tensor $\mathbf{M}$ satisfies the estimate

(21)  $\mathbf{M} \ge \lambda\mathbf{K}$

in the sense that $\mathbf{M}_u[p,p] \ge \lambda p[\mathbf{K}_up]$ for all $u \in \mathfrak{M}$ and $p \in T^*_u\mathfrak{M}$. Here the differential $\mathrm{D}_u$ is to be
interpreted using the linear structure of the ambient space X:

$\mathrm{D}_uF_\Phi[\xi] = \lim_{\varepsilon\to0}\frac{F_\Phi(u+\varepsilon\xi) - F_\Phi(u)}{\varepsilon} \in T_u\mathfrak{M},$

and analogously for $\mathrm{D}_u\mathbf{K}$.


Remark 1. In a smooth Riemannian setting, $\lambda$-convexity of $\Phi$ implies $\lambda$-contractivity for its
gradient flow, i.e.,

$d(u_t, u'_t) \le e^{-\lambda t}d(u_0, u'_0)$ for all $t \ge 0$ and arbitrary solutions $u, u'$ to (20).

2.2. Estimates on the flow. In the following discussion, we will not require the full strength of
the $\lambda$-convexity assumption $\mathbf{M} \ge \lambda\mathbf{K}$ from (21). Instead, in our calculations we will only apply
(21) to the argument $\mathrm{D}_u\Phi$. The resulting convex decay inequality

(CDI)  $\mathbf{M}_u[\mathrm{D}_u\Phi, \mathrm{D}_u\Phi] \ge \lambda\,\mathrm{D}_u\Phi[\mathbf{K}_u\mathrm{D}_u\Phi]$

is weaker than (21). Since

$-\frac{d}{ds}\Phi(v_s) = \mathrm{D}_v\Phi[\mathbf{K}_v\mathrm{D}_v\Phi]$ and $\frac{1}{2}\frac{d^2}{ds^2}\Phi(v_s) = \mathbf{M}_v[\mathrm{D}_v\Phi, \mathrm{D}_v\Phi]$

along solutions $v_s$ of (20), as will be shown in the proof below, (CDI) provides a relation between the first and the second
derivative of $\Phi$ along its gradient flow. The inequality (CDI) lies at the heart of the Bakry-Emery
approach to functional inequalities [2]. In a Markov chain setting, the inequality (CDI) has been
studied in [5], see also […].

Our general hypothesis in the remainder of this section is that (CDI) holds for some $\lambda > 0$. We
also assume that $\Phi$ has a unique global minimizer $\bar{u} \in \mathfrak{M}$, and, without loss of generality, that
$\Phi(u) \ge \Phi(\bar{u}) = 0$ for all $u \in \mathfrak{M}$. Clearly, these conditions are satisfied when $\Phi$ is $\lambda$-convex, in which
case $\bar{u}$ is the only critical point of $\Phi$.

The auto-dissipation $|\partial\Phi|^2 : \mathfrak{M} \to \mathbb{R}$ of $\Phi$ is defined by

(22)  $|\partial\Phi|^2(u) = \mathrm{D}_u\Phi[\mathbf{K}_u\mathrm{D}_u\Phi].$

It follows from our assumptions that $|\partial\Phi|^2(\bar{u}) = 0$.

Proposition 1 (Gradient flow estimates for $\Phi$). Along any solution $(v_s)_{s\ge0}$ of the gradient flow
(20), we have, for arbitrary $s \ge s' \ge 0$,

(23)  $\Phi(v_s) \le \Phi(v_{s'})e^{-2\lambda(s-s')},$

(24)  $|\partial\Phi|^2(v_s) \le |\partial\Phi|^2(v_{s'})e^{-2\lambda(s-s')},$

and further, for arbitrary $s > 0$,

(25)  $|\partial\Phi|^2(v_s) \le \Phi(v_0)\,\frac{2\lambda}{e^{2\lambda s}-1} \le \Phi(v_0)\,s^{-1}.$

Moreover, the following functional inequality holds for arbitrary $v \in \mathfrak{M}$:

(26)  $2\lambda\Phi(v) \le |\partial\Phi|^2(v).$

These results are classical. We sketch a proof here, which derives the estimates directly from the
hypothesis (CDI) by elementary calculations. The idea is to use the method of iterated gradients
from [5].
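
The estimates (23)-(26) can be illustrated on a toy example. The following minimal sketch makes assumptions that are not taken from the paper: the manifold is Euclidean space with Onsager operator equal to the identity, and the potential is $\Phi(u) = \frac12\sum_i a_iu_i^2$ with $a_i > 0$, for which (CDI) holds with $\lambda = \min_i a_i$ and the gradient flow is known in closed form.

```python
import math


def phi(u, a):
    """Potential Phi(u) = (1/2) * sum_i a_i * u_i^2, minimized at u = 0."""
    return 0.5 * sum(ai * ui * ui for ai, ui in zip(a, u))


def dissipation(u, a):
    """Auto-dissipation |dPhi|^2(u) = DPhi[K DPhi] for K = identity."""
    return sum((ai * ui) ** 2 for ai, ui in zip(a, u))


def flow(u0, a, s):
    """Exact solution at time s of the gradient flow u' = -grad Phi(u)."""
    return [ui * math.exp(-ai * s) for ai, ui in zip(a, u0)]


def check_proposition_1(u0, a, times, tol=1e-12):
    """Check (23)-(26) at the given positive times; tol absorbs rounding."""
    lam = min(a)  # (CDI) constant: smallest eigenvalue of the Hessian
    for s in times:
        us = flow(u0, a, s)
        decay = math.exp(-2.0 * lam * s)
        bound_25 = phi(u0, a) * 2.0 * lam / math.expm1(2.0 * lam * s)
        ok = (
            phi(us, a) <= phi(u0, a) * decay + tol                      # (23)
            and dissipation(us, a) <= dissipation(u0, a) * decay + tol  # (24)
            and dissipation(us, a) <= bound_25 + tol                    # (25)
            and 2.0 * lam * phi(us, a) <= dissipation(us, a) + tol      # (26)
        )
        if not ok:
            return False
    return True


if __name__ == "__main__":
    print(check_proposition_1([1.0, -2.0, 0.5], [1.0, 3.0, 7.0], [0.1, 0.5, 1.0, 2.0]))
```

In this flat setting (CDI) reduces to a lower bound on the Hessian in the gradient direction, which is why the smallest coefficient $a_i$ serves as $\lambda$.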


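Proof. We start by proving (24). To this end, we estimate the decay in time of

(27)  $|\partial\Phi|^2(v) = \mathrm{D}_v\Phi[\mathbf{K}_v\mathrm{D}_v\Phi] = \sum_{i,j}\mathbf{K}_{ij}(v)\,\partial_i\Phi(v)\,\partial_j\Phi(v),$

which is

(28)  $J(v_s) := -\frac{d}{ds}|\partial\Phi|^2(v_s).$

From the last representation in (27), we obtain, writing $\Phi_i = \partial_i\Phi$ and $\Phi_{ij} = \partial_i\partial_j\Phi$ for brevity,

$J(v) = -\sum_{i,j,k}\big(2\mathbf{K}_{ij}(v)\Phi_{ik}(v) + \partial_k\mathbf{K}_{ij}(v)\Phi_i(v)\big)\Phi_j(v)F^k_\Phi(v).$

On the other hand, the definition of $\mathbf{M}_v$ yields

$\mathbf{M}_v[\mathrm{D}_v\Phi, \mathrm{D}_v\Phi] = \mathrm{D}_v\Phi\big[\mathrm{D}_vF_\Phi[F_\Phi(v)]\big] + \frac{1}{2}\mathrm{D}_v\Phi\big[\mathrm{D}_v\mathbf{K}[F_\Phi(v)]\mathrm{D}_v\Phi\big]$

$= \sum_{i,j}\Phi_i(v)\,\partial_jF^i_\Phi(v)\,F^j_\Phi(v) + \frac{1}{2}\sum_{i,j,k}\Phi_i(v)\,\partial_k\mathbf{K}_{ij}(v)\,F^k_\Phi(v)\,\Phi_j(v)$

$= -\sum_{i,j,k}\Phi_i(v)\big(\partial_j\mathbf{K}_{ik}(v)\Phi_k(v) + \mathbf{K}_{ik}(v)\Phi_{jk}(v)\big)F^j_\Phi(v) + \frac{1}{2}\sum_{i,j,k}\Phi_i(v)\,\partial_k\mathbf{K}_{ij}(v)\,\Phi_j(v)\,F^k_\Phi(v)$

$= -\sum_{i,j,k}\Phi_i(v)\Big(\frac{1}{2}\partial_j\mathbf{K}_{ik}(v)\Phi_k(v) + \mathbf{K}_{ik}(v)\Phi_{jk}(v)\Big)F^j_\Phi(v),$

where the last identity follows by relabeling the indices. We thus obtain the crucial identity

(29)  $J(v) = 2\mathbf{M}_v[\mathrm{D}_v\Phi, \mathrm{D}_v\Phi].$

Applying the assumption (CDI), we infer that

(30)  $J(v) \ge 2\lambda\,\mathrm{D}_v\Phi[\mathbf{K}_v\mathrm{D}_v\Phi] = 2\lambda|\partial\Phi|^2(v).$

Now apply Gronwall's lemma to the resulting inequality

$-\frac{d}{ds}|\partial\Phi|^2(v_s) \ge 2\lambda|\partial\Phi|^2(v_s)$

to obtain (24). Next, we verify the functional inequality (26). First, observe that

(31)  $-\frac{d}{ds}\Phi(v_s) = -\mathrm{D}_v\Phi[F_\Phi(v_s)] = |\partial\Phi|^2(v_s).$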

For any fixed $s > s' \ge 0$, this allows to conclude, using (24), that

$\Phi(v_{s'}) - \Phi(v_s) = \int_{s'}^{s}|\partial\Phi|^2(v_r)\,dr \le |\partial\Phi|^2(v_{s'})\int_{s'}^{s}e^{-2\lambda(r-s')}\,dr,$

which implies that

$2\lambda\big(\Phi(v_{s'}) - \Phi(v_s)\big) \le \big(1 - e^{-2\lambda(s-s')}\big)|\partial\Phi|^2(v_{s'}).$

In the limit $s \to \infty$, we have $\Phi(v_s) \to \Phi(\bar{u}) = 0$, and thus we end up with

$2\lambda\Phi(v_{s'}) \le |\partial\Phi|^2(v_{s'}),$

which verifies (26). To prove (23), simply combine (31) with (26) and apply Gronwall's lemma again.
Finally, the estimate (25) is a consequence of the following calculation, using that $s \mapsto |\partial\Phi|^2(v_s)$ is
a monotone function thanks to (24):

$\frac{e^{2\lambda s}-1}{2\lambda}|\partial\Phi|^2(v_s) = \int_0^s e^{2\lambda(s-s')}\,ds'\,|\partial\Phi|^2(v_s) \le \int_0^s|\partial\Phi|^2(v_{s'})\,ds' = -\int_0^s\frac{d}{dr}\Big|_{r=s'}\Phi(v_r)\,ds' = \Phi(v_0) - \Phi(v_s).$

By non-negativity of $\Phi$, we arrive at (25). □


2.3. Estimates on the flow of the dissipation functional. We continue to assume that (CDI)
holds with some $\lambda > 0$. We also assume the normalization $\Phi(\bar{u}) = 0$, with $\bar{u}$ being the global
minimizer. Below, we study another gradient flow, namely the one generated by the dissipation
$\Psi = |\partial\Phi|^2$,

(32)  $\dot{u} = F_\Psi(u) = -\mathbf{K}_u\mathrm{D}_u\Psi.$

In general, no information is available on the convexity of the flow induced by $F_\Psi$. Still, the following
analogue of Proposition 1 holds, thanks to the intimate relation of $\Psi$ to the $\lambda$-convex functional $\Phi$.

Proposition 2 (Gradient flow estimates for $\Psi$). Along any solution $(u_t)_{t\ge0}$ of the auxiliary gradient
flow (32), we have, for arbitrary $t \ge t' \ge 0$,

(33)  $\Phi(u_t) \le \Phi(u_{t'})e^{-(2\lambda)^2(t-t')},$

(34)  $|\partial\Phi|^2(u_t) \le |\partial\Phi|^2(u_{t'})e^{-(2\lambda)^2(t-t')},$

and further, for arbitrary $t > 0$,

(35)  $|\partial\Phi|^2(u_t) \le \Phi(u_0)\,\frac{2\lambda}{e^{(2\lambda)^2t}-1} \le \Phi(u_0)\,(2\lambda t)^{-1}.$

This result has been proven in the setting of metric spaces in [33, Section 3]. As for Proposition
1, we sketch a proof here which only uses the inequality (CDI) and some elementary calculations.
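
As a sanity check of (33)-(35), the same kind of toy example can be used. The sketch below again rests on assumptions that are not from the paper: Euclidean space with identity Onsager operator and $\Phi(u) = \frac12\sum_i a_iu_i^2$, so that $\Psi(u) = |\partial\Phi|^2(u) = \sum_i a_i^2u_i^2$ and the auxiliary flow (32) decouples into $\dot{u}_i = -2a_i^2u_i$.

```python
import math


def phi(u, a):
    """Potential Phi(u) = (1/2) * sum_i a_i * u_i^2."""
    return 0.5 * sum(ai * ui * ui for ai, ui in zip(a, u))


def psi(u, a):
    """Dissipation Psi(u) = |dPhi|^2(u) = sum_i (a_i * u_i)^2 for K = identity."""
    return sum((ai * ui) ** 2 for ai, ui in zip(a, u))


def psi_flow(u0, a, t):
    """Exact solution at time t of u' = -grad Psi(u), i.e. u_i' = -2 a_i^2 u_i."""
    return [ui * math.exp(-2.0 * ai * ai * t) for ai, ui in zip(a, u0)]


def check_proposition_2(u0, a, times, tol=1e-12):
    """Check (33)-(35) at the given positive times; tol absorbs rounding."""
    lam = min(a)
    rate = (2.0 * lam) ** 2
    for t in times:
        ut = psi_flow(u0, a, t)
        decay = math.exp(-rate * t)
        bound_35 = phi(u0, a) * 2.0 * lam / math.expm1(rate * t)
        ok = (
            phi(ut, a) <= phi(u0, a) * decay + tol                  # (33)
            and psi(ut, a) <= psi(u0, a) * decay + tol              # (34)
            and psi(ut, a) <= bound_35 + tol                        # (35), first bound
            and bound_35 <= phi(u0, a) / (2.0 * lam * t) + tol      # (35), second bound
        )
        if not ok:
            return False
    return True


if __name__ == "__main__":
    print(check_proposition_2([1.0, -2.0, 0.5], [1.0, 3.0, 7.0], [0.05, 0.1, 0.5, 1.0]))
```

Note the squared rate $(2\lambda)^2$: along the flow of the dissipation $\Psi$, the potential $\Phi$ decays faster than along its own gradient flow whenever $2\lambda > 1$.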

Proof. We start by estimating the decay of $\Phi(u_t)$ in time, i.e.,

$J(u_t) := -\frac{d}{dt}\Phi(u_t).$

Observe that, thanks to the symmetry of the Onsager operator,

$J(u) = \mathrm{D}_u\Phi[-F_\Psi(u)] = \mathrm{D}_u\Phi[\mathbf{K}_u\mathrm{D}_u\Psi] = \mathrm{D}_u\Psi[\mathbf{K}_u\mathrm{D}_u\Phi] = \mathrm{D}_u\Psi[-F_\Phi(u)],$

thus the J defined above coincides with the J defined in (28). From the inequality (30) in
combination with the inequality (26), it follows that

(36)  $J(u) \ge 2\lambda\Psi(u) \ge (2\lambda)^2\Phi(u).$

Another application of Gronwall's lemma yields (33). In preparation for the proof of (34),
observe that the Cauchy-Schwarz inequality for the scalar product $\langle\cdot,\cdot\rangle_u$ translates into the following
inequality for the Onsager operator $\mathbf{K}_u$:

$(p[\mathbf{K}_uq])^2 \le p[\mathbf{K}_up]\,q[\mathbf{K}_uq]$ for all $p, q \in T^*_u\mathfrak{M}$.

In combination with the estimate (30), we obtain

$(2\lambda)^2\big(\mathrm{D}_u\Phi[\mathbf{K}_u\mathrm{D}_u\Phi]\big)^2 \le J(u)^2 = \big(\mathrm{D}_u\Psi[\mathbf{K}_u\mathrm{D}_u\Phi]\big)^2 \le \mathrm{D}_u\Psi[\mathbf{K}_u\mathrm{D}_u\Psi]\;\mathrm{D}_u\Phi[\mathbf{K}_u\mathrm{D}_u\Phi].$

Division by $\mathrm{D}_u\Phi[\mathbf{K}_u\mathrm{D}_u\Phi]$ leads to

(37)  $I(u) := \mathrm{D}_u\Psi[\mathbf{K}_u\mathrm{D}_u\Psi] \ge (2\lambda)^2\,\mathrm{D}_u\Phi[\mathbf{K}_u\mathrm{D}_u\Phi] = (2\lambda)^2\Psi(u).$

Since $I(u_t) = -\frac{d}{dt}\Psi(u_t)$, we obtain (34) by yet another application of Gronwall's lemma. For the
proof of (35), we use the inequality (34) and the first inequality from (36). We thus obtain

$\frac{e^{(2\lambda)^2t}-1}{(2\lambda)^2}|\partial\Phi|^2(u_t) = \int_0^t e^{(2\lambda)^2(t-t')}|\partial\Phi|^2(u_t)\,dt' \le \int_0^t|\partial\Phi|^2(u_{t'})\,dt' \le \frac{1}{2\lambda}\int_0^tJ(u_{t'})\,dt' = -\frac{1}{2\lambda}\int_0^t\frac{d}{dr}\Big|_{r=t'}\Phi(u_r)\,dt' = \frac{\Phi(u_0) - \Phi(u_t)}{2\lambda},$

from which the first inequality in (35) follows since $\Phi$ is non-negative. The second inequality is
elementary. □


2.4. Application: asymptotics for the Fokker-Planck and QDD equation. To conclude
our short review on gradient flows with (CDI), we show how the estimates (9)-(12) on the
long-time asymptotics for solutions to (2) and (1), respectively, can be obtained from Propositions
1 and 2 above, at least formally. For the rigorous derivation of the stated long-time asymptotics
by variational methods, we refer the reader to [1] and to [33].

We consider the set $\mathcal{P}_+(\Omega)$ of strictly positive probability densities $u : \Omega \to \mathbb{R}_+$, endowed with
the $L^2$-Wasserstein metric, as Riemannian manifold $\mathfrak{M}$. Tangent and cotangent vectors at $u \in \mathfrak{M}$
are identified with functions in $L^2(\Omega)$ of vanishing mean, their pairing being given by

$p[\xi] = \int_\Omega p(x)\xi(x)\,dx.$

The definition of the scalar product on the tangent spaces is intricate (it requires the solution of an
auxiliary elliptic problem), but the associated Onsager operator $\mathbf{K}_u$ has an explicit form:

(38)  $\mathbf{K}_up = -\nabla\cdot(u\nabla p).$

In this framework, the Fokker-Planck equation (2) can be written as the gradient flow of the entropy
$\mathcal{H}$ from (6):

(39)  $\partial_sv_s = \mathcal{A}_\pi v_s$ with $\mathcal{A}_\pi v = -\mathbf{K}_v\mathrm{D}_v\mathcal{H} = \nabla\cdot(v\nabla\log(v/\pi)) = \Delta v + \nabla\cdot(v\nabla V).$

This representation has been the starting point for the existence proof in the celebrated work [24].

Next, by the results of McCann [33], the $\lambda$-convexity of the potential V implies $\lambda$-convexity of
this gradient flow; see also [9] for an alternative proof of this fact using the formalism developed
above. Proposition 1 immediately yields the convergence properties stated in (9) as well as the
regularization estimate (10).

We proceed to analyze (1). To begin with, let us rewrite, by integration by parts, the Fisher
information $\mathcal{I}$ from (6) with the help of $\mathcal{A}_\pi$ introduced in (39):

(40)  $\mathcal{I}(w) = -\int_\Omega\log(w/\pi)\,\nabla\cdot(w\nabla\log(w/\pi))\,dx = -\int_\Omega\log(w/\pi)\,\mathcal{A}_\pi w\,dx.$

From this representation, it is immediate to deduce the relation (8) between entropy and Fisher
information, i.e., that

$|\partial\mathcal{H}|^2(w) = -\mathrm{D}_w\mathcal{H}[\mathcal{A}_\pi w] = \mathcal{I}(w).$

Next, we use (40) to compute the first variation of $\mathcal{I}$:

(41)  $\mathrm{D}_u\mathcal{I}[\xi] = -\int_\Omega\big[(\xi/u)\,\mathcal{A}_\pi u + \log(u/\pi)\,\mathcal{A}_\pi\xi\big]\,dx = -\int_\Omega\Big[\frac{\mathcal{A}_\pi u}{u} + \mathcal{A}^*_\pi\log(u/\pi)\Big]\xi\,dx,$

where $\mathcal{A}^*_\pi$ is the $L^2(dx)$-adjoint of $\mathcal{A}_\pi$, that is

$\mathcal{A}^*_\pi\log(u/\pi) = \mathcal{A}^*_\pi\log u + \mathcal{A}^*_\pi V = \Delta\log u - \nabla V\cdot\nabla\log u + \Delta V - |\nabla V|^2.$

We thus obtain

$\mathrm{D}_u\mathcal{I}[\xi] = -\int_\Omega\Big[\frac{\Delta u}{u} + \Delta\log u + 2\Delta V - |\nabla V|^2\Big]\xi\,dx.$

From this and the relation (4) between V and W, it is obvious that (1) can be written as the
gradient flow of $\mathcal{I}$:

$\partial_tu_t = F_\mathcal{I}(u_t)$ with $F_\mathcal{I}(u) = -\mathbf{K}_u\mathrm{D}_u\mathcal{I} = -\nabla\cdot\Big(u\nabla\Big[\frac{\Delta u}{u} + \Delta\log u - W\Big]\Big).$

Remark 2. For later reference, we point out that in view of (41), the equation (1) can be
equivalently written in the form

(42)  $\partial_tu = \mathbf{K}_u\Big[\frac{\mathcal{A}_\pi u}{u} + \mathcal{A}^*_\pi\log(u/\pi)\Big].$

This is the representation which naturally appears after discretization, see (74) below.

In combination, this means that Proposition 2 applies to solutions $u_t$ of (1). The respective
estimates (33) and (34) turn into (11), and (35) becomes (12).


3. Discretization of the Fokker-Planck equation 

3.1. Finite volume discretization. For given $N \in \mathbb{N}$, define the length parameter h := 1/N, and
introduce the d-dimensional cubic lattice of side length N,

$J^h := \{1,\ldots,N\}^d \subset \mathbb{Z}^d.$

Multi-indices in $J^h$ are denoted by i and j, and we write $i \leftrightarrow j$ if i and j are neighbors, i.e., |i − j| = 1.
Intuitively, each $j \in J^h$ labels a subcube

$\omega_j := (h(j_1-1), hj_1) \times \cdots \times (h(j_d-1), hj_d) \subset \Omega$

of side length h in $\Omega$, and each vector $U \in \mathbb{R}^{J^h}$ is associated to a function $u^h \in L^\infty(\Omega)$ that is
piecewise constant on each $\omega_j$:

$u^h(x) = U_j$ for all $x \in \omega_j$.

In this spirit, we refer to

$\mathcal{P}_+(J^h) := \Big\{U \in \mathbb{R}_+^{J^h}\ ;\ h^d\sum_{i\in J^h}U_i = 1\Big\}$

as the space of positive probability densities on $J^h$; indeed, for each $U \in \mathcal{P}_+(J^h)$,

$\int_\Omega u^h(x)\,dx = h^d\sum_{j\in J^h}U_j = 1.$

Both tangent vectors $\Xi \in T_U\mathcal{P}_+(J^h)$ and cotangent vectors $P \in T^*_U\mathcal{P}_+(J^h)$ are identified with elements in
$\mathbb{R}^{J^h}$ of vanishing mean,

$h^d\sum_{j\in J^h}P_j = 0, \quad h^d\sum_{j\in J^h}\Xi_j = 0,$

and their pairing is given by

$P[\Xi] = h^d\sum_{j\in J^h}P_j\Xi_j.$

Next, we introduce a discrete approximation $\Pi^h \in \mathcal{P}_+(J^h)$ of the steady state $\pi$ from (3). First,
define vectors $V^{[1],h},\ldots,V^{[d],h} \in \mathbb{R}^N$ by

(43)  $V^{[k],h}_j = \frac{1}{h}\int_{h(j-1)}^{hj}V^{[k]}(r)\,dr,$

and accordingly $\Pi^{[1],h},\ldots,\Pi^{[d],h} \in \mathcal{P}_+(\{1,\ldots,N\})$ by

(44)  $\Pi^{[k],h}_j = \frac{1}{Z^{[k],h}}\exp\big(-V^{[k],h}_j\big),$

with the appropriate choice of the normalization constant $Z^{[k],h} > 0$. Now, $\Pi^h \in \mathcal{P}_+(J^h)$ itself is
defined such that it inherits the product structure (16):

(45)  $\Pi^h_j = \Pi^{[1],h}_{j_1}\cdots\Pi^{[d],h}_{j_d}.$

Since V is smooth, the respective piecewise constant densities $\pi^h$ converge to $\pi$ uniformly on $\Omega$ as
$h \downarrow 0$.
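
The construction (43)-(45) can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: the cell averages in (43) are approximated by a composite midpoint rule with `n_quad` sub-intervals (an arbitrary choice), multi-indices are stored 0-based as tuples, and all function names are hypothetical.

```python
import math
from itertools import product


def cell_averages(V, N, n_quad=20):
    """Approximate V_j^h = (1/h) * int_{h(j-1)}^{hj} V(r) dr for j = 1..N.

    Uses a composite midpoint rule with n_quad sub-intervals per cell.
    """
    h = 1.0 / N
    sub = h / n_quad
    averages = []
    for j in range(N):
        left = j * h
        averages.append(sum(V(left + (m + 0.5) * sub) for m in range(n_quad)) / n_quad)
    return averages


def discrete_steady_state_1d(V, N):
    """One-dimensional factor (44); Z^h is chosen such that h * sum_j Pi_j = 1."""
    h = 1.0 / N
    weights = [math.exp(-v) for v in cell_averages(V, N)]
    Zh = h * sum(weights)
    return [w / Zh for w in weights], Zh


def discrete_steady_state(V_list, N):
    """Product structure (45): Pi_j = Pi^[1]_{j_1} * ... * Pi^[d]_{j_d}."""
    factors = [discrete_steady_state_1d(V, N)[0] for V in V_list]
    Pi = {}
    for j in product(range(N), repeat=len(V_list)):
        value = 1.0
        for k, jk in enumerate(j):
            value *= factors[k][jk]
        Pi[j] = value
    return Pi


if __name__ == "__main__":
    N = 8
    Pi = discrete_steady_state([lambda r: 2.0 * (r - 0.5) ** 2,
                                lambda r: (r - 0.3) ** 2], N)
    print(sum(Pi.values()) / N ** 2)  # total mass h^d * sum_j Pi_j
```

Because each one-dimensional factor is normalized separately, the product is automatically a probability density on $J^h$, which mirrors the factorization (16) of the continuous steady state.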

Lemma 1. The piecewise constant representation $\pi^h \in \mathcal{P}_+(\Omega)$ with respective values $\Pi^h_j$ on the
cubes $\omega_j$ is the unique minimizer of $\mathcal{H}$ on the subspace of piecewise constant densities in $\mathcal{P}_+(\Omega)$.
Moreover,

(46)  $\gamma^h := \mathcal{H}(\pi^h) = \log\frac{Z^{[1]}\cdots Z^{[d]}}{Z^{[1],h}\cdots Z^{[d],h}}.$

This lemma justifies the definition of the discretized entropy $\mathcal{H}^h$ in (14).

Remark 3. It is easily seen that $Z^{[k],h} \uparrow Z^{[k]}$ for each k = 1,...,d as $h \downarrow 0$. Hence $\gamma^h \downarrow 0$.

Proof. If $u^h$ is piecewise constant on the boxes $\omega_j$ with respective values $U_j$, then

$\mathcal{H}(u^h) = \int_\Omega u^h\log(u^h/\pi)\,dx = \sum_{j\in J^h}\int_{\omega_j}U_j(\log U_j - \log\pi)\,dx$

$= h^d\sum_{j\in J^h}U_j\Big(\log U_j + \log Z + h^{-d}\int_{\omega_j}V(x)\,dx\Big)$

$= h^d\sum_{j\in J^h}U_j\log(U_j/\Pi^h_j) + \log\frac{Z^{[1]}\cdots Z^{[d]}}{Z^{[1],h}\cdots Z^{[d],h}}.$

For the last line, we have used the property (15) of V, which yields that

$h^{-d}\int_{\omega_j}V(x)\,dx = h^{-d}\int_{\omega_j}\big(V^{[1]}(x_1) + \cdots + V^{[d]}(x_d)\big)\,dx = V^{[1],h}_{j_1} + \cdots + V^{[d],h}_{j_d},$

the property $Z = Z^{[1]}\cdots Z^{[d]}$, and the definition of $\Pi^h$ in (44) and (45) above. Since both $U, \Pi^h \in \mathcal{P}_+(J^h)$, we may further write

$\mathcal{H}(u^h) = h^d\sum_{j\in J^h}\Pi^h_j\Big(1 + (U_j/\Pi^h_j)\big[\log(U_j/\Pi^h_j) - 1\big]\Big) + \log\frac{Z^{[1]}\cdots Z^{[d]}}{Z^{[1],h}\cdots Z^{[d],h}}.$

Using that $r \mapsto r(\log r - 1) + 1$ is strictly convex with minimum zero attained at r = 1, Jensen's
inequality implies that

$\mathcal{H}(u^h) \ge \log\frac{Z^{[1]}\cdots Z^{[d]}}{Z^{[1],h}\cdots Z^{[d],h}},$

with equality if and only if $u^h = \pi^h$. □

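3.2. Discretized Fokker-Planck equation. We implicitly introduce a metric on $\mathcal{P}_+(J^h)$ by
means of the Onsager operator $\mathbf{K}^h_U : T^*_U\mathcal{P}_+(J^h) \to T_U\mathcal{P}_+(J^h)$ with

(47)  $P[\mathbf{K}^h_UQ] = h^d\sum_{i\leftrightarrow j}\sqrt{\Pi^h_i\Pi^h_j}\;\Lambda_{ij}(U)\Big(\frac{P_i - P_j}{h}\Big)\Big(\frac{Q_i - Q_j}{h}\Big)$

for all $P, Q \in T^*_U\mathcal{P}_+(J^h)$, where the sum runs over all pairs of neighboring indices $i \leftrightarrow j$, i.e., over
all edges of unit length in $J^h$, and $\Lambda_{ij}(U)$ is an abbreviation of

$\Lambda_{ij}(U) = \Lambda\Big(\frac{U_i}{\Pi^h_i}, \frac{U_j}{\Pi^h_j}\Big)$

with the logarithmic mean $\Lambda : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$ given by

(48)  $\Lambda(a,b) = \int_0^1a^{1-s}b^s\,ds = \begin{cases}\dfrac{a-b}{\log a - \log b} & \text{if } a \ne b,\\[1ex] a & \text{if } a = b.\end{cases}$

It has been shown in […] that $\mathbf{K}^h$ induces a distance on $\mathcal{P}_+(J^h)$, which extends to the closure $\mathcal{P}(J^h)$
of merely non-negative probability densities. The resulting metric space is geodesic and complete.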
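
The two representations of the logarithmic mean in (48) can be compared numerically. The following minimal sketch uses hypothetical function names; the number of quadrature points is an arbitrary choice, and `math.log1p` is used only to evaluate $\log a - \log b$ accurately when a and b are close.

```python
import math


def log_mean(a, b):
    """Logarithmic mean (a - b) / (log a - log b), with the value a if a == b."""
    if a <= 0.0 or b <= 0.0:
        raise ValueError("arguments must be positive")
    if a == b:
        return a
    # log a - log b = log(1 + (a - b) / b), evaluated stably via log1p
    return (a - b) / math.log1p((a - b) / b)


def log_mean_quadrature(a, b, n=2000):
    """Midpoint-rule approximation of int_0^1 a^(1-s) * b^s ds."""
    step = 1.0 / n
    return step * sum(a ** (1.0 - (k + 0.5) * step) * b ** ((k + 0.5) * step)
                      for k in range(n))


if __name__ == "__main__":
    a, b = 2.0, 8.0
    # geometric mean <= logarithmic mean <= arithmetic mean
    print(math.sqrt(a * b), log_mean(a, b), 0.5 * (a + b))
    print(log_mean_quadrature(a, b))
```

The ordering between the geometric, logarithmic and arithmetic mean is the elementary fact behind the consistency argument for the discrete Onsager operator.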


Remark 4. The definition (47) of the discrete Onsager operator above is consistent with that of
the Onsager operator for the $L^2$-Wasserstein metric on $\mathcal{P}_+(\Omega)$ from (38). To see this relation, let a
smooth and positive density $u \in \mathcal{P}_+(\Omega)$ and two smooth functions $p, q : \Omega \to \mathbb{R}$ be given. For $h \downarrow 0$,
let $U^h \in \mathcal{P}_+(J^h)$ and $P^h, Q^h \in T^*_{U^h}\mathcal{P}_+(J^h)$ be approximations of u and p, q in the sense that their
piecewise constant interpolations $u^h, p^h, q^h \in L^\infty(\Omega)$ converge to the respective u, p, q uniformly.
Further, for each $j \in J^h$, we introduce the center of the jth cube,

$x_j := \big(h(j_1 - 1/2), \ldots, h(j_d - 1/2)\big).$

Since the values of $\Pi^h$ and $U^h$ at neighboring sites $i \leftrightarrow j$ are O(h)-close to each other, the logarithmic,
geometric and arithmetic mean of $U^h_i/\Pi^h_i$ and $U^h_j/\Pi^h_j$ are O(h)-close to each other as well. Hence,
we have

$\sqrt{\Pi^h_i\Pi^h_j}\;\Lambda_{ij}(U^h) = \sqrt{\Pi^h_i\Pi^h_j}\sqrt{\frac{U^h_i}{\Pi^h_i}\frac{U^h_j}{\Pi^h_j}} + O(h) = \sqrt{U^h_iU^h_j} + O(h) = \frac{1}{2}\big(u(x_i) + u(x_j)\big) + O(h)$

inside the definition (47). Further, due to the square grid combinatorics of $J^h$,

$\frac{P^h_i - P^h_j}{h} = (i - j)\cdot\nabla p\Big(\frac{x_i + x_j}{2}\Big) + O(h),$

and similarly for the difference quotients of $Q^h$. Working out the combinatorics, one obtains from
the definition of the discretized Onsager operator in (47) the following integral approximation:

$P^h[\mathbf{K}^h_{U^h}Q^h] = h^d\sum_{j\in J^h}\big(u(x_j) + O(h)\big)\sum_{k=1}^d\big(e_k\cdot\nabla p(x_j + \tfrac{h}{2}e_k) + O(h)\big)\big(e_k\cdot\nabla q(x_j + \tfrac{h}{2}e_k) + O(h)\big)$

$= \int_\Omega u(x)\,\nabla p(x)\cdot\nabla q(x)\,dx + O(h).$

The last expression is an approximation of the original Onsager operator from (38).


As announced in (14), the entropy functional $\mathcal{H}^h$ on $\mathcal{P}_+(J^h)$ is defined by restriction of the
original entropy $\mathcal{H}$,

$\mathcal{H}^h(U) = h^d\sum_{j\in J^h}U_j\log(U_j/\Pi^h_j) = \mathcal{H}(u^h) - \gamma^h,$

where $\gamma^h$ defined in (46) is such that the convex functional $\mathcal{H}^h(U)$ is non-negative for all
$U \in \mathcal{P}_+(J^h)$, and vanishes precisely for $U = \Pi^h$ given in (45). Accordingly, introduce the discretization
of the Fokker-Planck operator on $\mathcal{P}_+(J^h)$ by

(49)  $\mathcal{A}^hU := -\mathbf{K}^h_U\mathrm{D}_U\mathcal{H}^h.$

The representation as a linear operator is justified by the following.

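Lemma 2. The discrete Fokker-Planck operator $\mathcal{A}^h$ is linear on the simplex $\mathcal{P}_+(J^h)$:

$(\mathcal{A}^hU)_i = \sum_{j\in J^h}\mathcal{A}^h_{ij}U_j$ for all $U \in \mathcal{P}_+(J^h)$,

with the matrix elements of $\mathcal{A}^h \in \mathbb{R}^{J^h\times J^h}$ being given by

(50)  $\mathcal{A}^h_{ij} = \begin{cases}h^{-2}\sqrt{\Pi^h_i/\Pi^h_j} & \text{if } i \leftrightarrow j,\\[0.5ex] -h^{-2}\sum_{i'\leftrightarrow i}\sqrt{\Pi^h_{i'}/\Pi^h_i} & \text{if } i = j,\\[0.5ex] 0 & \text{otherwise.}\end{cases}$

Moreover, the adjoint operator $\mathcal{L}^h = (\mathcal{A}^h)^T$ given by $(\mathcal{L}^h\psi)_i = \sum_{j\in J^h}\mathcal{A}^h_{ji}\psi_j$ is the generator of an
irreducible and reversible Markov chain on $J^h$ with invariant distribution $\Pi^h$.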

Proof. First observe that, at each U £ V+(J h ), 

D uH h [E] = h d E (! + log(t/ j /n' 1 ))S j for all S £ TuV+(J h ). 
j eJ h 

Thus, by definition of A, we have for each P £ TyP + ( J h ): 


P[ - K^D l rH h ] = 

i-Hj 

= ^ d E 


/ iog(i7i/n?) - fog(t/j/ni ! 





p-p 




= h‘ 


d -2 



This shows the linearity of the operator in (49), and yields the representation (50). 

$M^h$ being the adjoint generator of a Markov chain means that all of its off-diagonal entries
are non-negative, and that the column sums vanish. Both properties are immediately verified by
inspection of (50). Irreducibility means that for any two indices $i_*, i^*$, one finds a chain $(i_m)_{m=0,\dots,M}$
of indices $i_m \in J^h$ with $i_0 = i_*$ and $i_M = i^*$ such that $M^h_{i_m i_{m-1}} > 0$ for all $m = 1,\dots,M$. Since
$M^h_{ij} > 0$ whenever $i \sim j$, one may take for $(i_m)_{m=0,\dots,M}$ any chain with $i_{m-1} \sim i_m$ connecting $i_*$
with $i^*$. For reversibility, we need to verify the detailed balance condition

\[ M^h_{ij}\,\Pi^h_j = M^h_{ji}\,\Pi^h_i \quad\text{for all } i,j \in J^h. \tag{51} \]

This again is an immediate consequence of the representation (50). Note that (51) together with
the Markov property implies $M^h\Pi^h = 0$, i.e., $\Pi^h$ is indeed an (in fact: the unique) invariant
distribution.

□ 
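The three properties used in this proof (vanishing column sums, detailed balance, and invariance of the stationary weights) can be checked numerically. The following is a minimal one-dimensional sketch, not the authors' implementation: the helper names and the choice of weights (a quadratic potential sampled at cell midpoints, normalised to sum to one) are assumptions made only for illustration.

```python
import math

def midpoint_weights(n, lam):
    # Hypothetical stationary weights: Pi_i proportional to exp(-V(x_i)) with
    # V(x) = lam/2 * (x - 1/2)^2 sampled at the cell midpoints x_i = (i + 1/2) h.
    h = 1.0 / n
    raw = [math.exp(-0.5 * lam * ((i + 0.5) * h - 0.5) ** 2) for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def fokker_planck_matrix(pi, h):
    # Entries as in (50) on a 1D chain: M[i][j] = h^-2 sqrt(pi_i / pi_j) for
    # neighbours i ~ j; the diagonal entry makes each column sum to zero.
    n = len(pi)
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in (j - 1, j + 1):
            if 0 <= i < n:
                m[i][j] = math.sqrt(pi[i] / pi[j]) / h ** 2
                m[j][j] -= m[i][j]
    return m

n = 8
h = 1.0 / n
pi = midpoint_weights(n, lam=10.0)
M = fokker_planck_matrix(pi, h)

# Column sums (Markov property of the adjoint generator).
col_sums = [sum(M[i][j] for i in range(n)) for j in range(n)]
# Largest violation of detailed balance M_ij Pi_j = M_ji Pi_i.
balance = max(abs(M[i][j] * pi[j] - M[j][i] * pi[i])
              for i in range(n) for j in range(n))
# Invariance: M applied to Pi.
M_pi = [sum(M[i][j] * pi[j] for j in range(n)) for i in range(n)]
```

All three residuals are zero up to rounding, independently of how the weights are normalised, since only ratios of the weights enter the matrix.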


3.3. λ-contractivity of the Fokker-Planck flow. The goal of this section is to prove:


Proposition 3. The flow F U h from (49) satisfies (CDI) with 

\[ \lambda^h = \frac{2}{h^2}\Big(1-\exp\big(-\tfrac{h^2}{2}\lambda\big)\Big) = \lambda + O(h^2). \tag{52} \]

This result appears to be novel for dimensions d > 1, but its proof is obtained by combination of 
two results from the literature. The key observation is that, in view of the factorization property 
(45), the space V+(J h ) carries a natural tensorial structure that is compatible with the evolution 
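(49) of the spatially discrete Fokker-Planck equation. More precisely:

Lemma 3. For each pair of indices $i, j \in J^h$,

\[
M^h_{ij} = M^{[1],h}_{i_1 j_1}\,\delta_{i_2 j_2}\delta_{i_3 j_3}\cdots\delta_{i_d j_d}
+ \delta_{i_1 j_1}\,M^{[2],h}_{i_2 j_2}\,\delta_{i_3 j_3}\cdots\delta_{i_d j_d}
+ \cdots
+ \delta_{i_1 j_1}\delta_{i_2 j_2}\delta_{i_3 j_3}\cdots M^{[d],h}_{i_d j_d}.
\tag{53}
\]

Here each $M^{[k],h} \in \mathbb{R}^{N\times N}$ is a tri-diagonal matrix,

\[
M^{[k],h} =
\begin{pmatrix}
-\sigma^{[k],h}_1 & \beta^{[k],h}_1 & & & \\
\alpha^{[k],h}_1 & -\sigma^{[k],h}_2 & \beta^{[k],h}_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \alpha^{[k],h}_{N-2} & -\sigma^{[k],h}_{N-1} & \beta^{[k],h}_{N-1} \\
 & & & \alpha^{[k],h}_{N-1} & -\sigma^{[k],h}_N
\end{pmatrix},
\tag{54}
\]

and the entries $\alpha^{[k],h}_i$, $\beta^{[k],h}_i$, $\sigma^{[k],h}_i$ are given by

\[ \alpha^{[k],h}_i = h^{-2}\sqrt{\Pi^{[k],h}_{i+1}/\Pi^{[k],h}_i} = h^{-2}\exp\Big(\tfrac12\big(V^{[k],h}_i - V^{[k],h}_{i+1}\big)\Big), \tag{55} \]

\[ \beta^{[k],h}_i = h^{-2}\sqrt{\Pi^{[k],h}_i/\Pi^{[k],h}_{i+1}} = h^{-2}\exp\Big(\tfrac12\big(V^{[k],h}_{i+1} - V^{[k],h}_i\big)\Big), \tag{56} \]

\[ \sigma^{[k],h}_1 = \alpha^{[k],h}_1, \qquad \sigma^{[k],h}_i = \alpha^{[k],h}_i + \beta^{[k],h}_{i-1} \ (1<i<N), \qquad \sigma^{[k],h}_N = \beta^{[k],h}_{N-1}. \tag{57} \]

Proof. This follows directly from the representation (50) of the entries of $M^h$.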


□ 
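The tensorial structure stated in Lemma 3 can be illustrated in two dimensions by comparing a matrix assembled directly from (50) with the sum of the two one-dimensional pieces. This is a hedged sketch: the function names, the row-major flattening of the double index, and the sample weights are hypothetical choices for illustration; only the product form of the weights and the entries of (50) are taken from the text.

```python
import math

def chain_matrix(pi, h):
    # One-dimensional tri-diagonal block built from the rule in (50).
    n = len(pi)
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in (j - 1, j + 1):
            if 0 <= i < n:
                m[i][j] = math.sqrt(pi[i] / pi[j]) / h ** 2
                m[j][j] -= m[i][j]
    return m

def direct_matrix(pi1, pi2, h):
    # Two-dimensional matrix from (50) with product weights Pi_(a,b) = pi1[a] * pi2[b];
    # the double index (a, b) is flattened to a * n2 + b.
    n1, n2 = len(pi1), len(pi2)
    m = [[0.0] * (n1 * n2) for _ in range(n1 * n2)]
    for j1 in range(n1):
        for j2 in range(n2):
            j = j1 * n2 + j2
            for i1, i2 in ((j1 - 1, j2), (j1 + 1, j2), (j1, j2 - 1), (j1, j2 + 1)):
                if 0 <= i1 < n1 and 0 <= i2 < n2:
                    i = i1 * n2 + i2
                    ratio = (pi1[i1] * pi2[i2]) / (pi1[j1] * pi2[j2])
                    m[i][j] = math.sqrt(ratio) / h ** 2
                    m[j][j] -= m[i][j]
    return m

def tensor_sum(m1, m2):
    # Right-hand side of (53) for d = 2: M1 (x) delta + delta (x) M2.
    n1, n2 = len(m1), len(m2)
    m = [[0.0] * (n1 * n2) for _ in range(n1 * n2)]
    for i1 in range(n1):
        for i2 in range(n2):
            for j1 in range(n1):
                for j2 in range(n2):
                    value = 0.0
                    if i2 == j2:
                        value += m1[i1][j1]
                    if i1 == j1:
                        value += m2[i2][j2]
                    m[i1 * n2 + i2][j1 * n2 + j2] = value
    return m

h = 0.25
pi1 = [0.1, 0.4, 0.3, 0.2]       # arbitrary positive weights, direction 1
pi2 = [0.25, 0.35, 0.15, 0.25]   # arbitrary positive weights, direction 2
direct = direct_matrix(pi1, pi2, h)
tensor = tensor_sum(chain_matrix(pi1, h), chain_matrix(pi2, h))
max_diff = max(abs(direct[a][b] - tensor[a][b])
               for a in range(16) for b in range(16))
```

The two matrices agree up to rounding because a neighbouring pair differs in exactly one coordinate, so the factor of the product weight belonging to the other coordinate cancels in the ratio.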


Naturally, there is an associated decomposition of the operator K. h on V+{J h ) C R+ into a sum 
of operators with each acting on the smaller state spaces M? by 

AT-l 

(58) P[Kf’ h Q] =hJ2 

3=1 


n [*].h n W.fc A 


Uj 

yr[k],h ’ Tr[fe],/l 

A L j iA i+i 



for U £ and P,Q £ WL N , recalling the notation II^A introduced in (45). For definiteness, 
select a spatial direction k £ {l,...,d} and introduce accordingly: j\F\A c J h as the set of 
indices j' with j' k = 0; for each U £ V + (J h ) and j 7 £ jW’ h the projection such that 

(Uy^)j = 1 ,j,j' k+1 ,...,j' d )) similarly, for P, Q £ T ^V+{J h ) the projections Py\Qy^ £ R w . It 

is then easily verified that 

(59) P[K h u Q}=J2h d - 1 E p y ] [ K uwQ?]- 

k— 1 j f £j[k],h 

Indeed, one only needs to take into account the square-grid structure of J h , and the fact that for 
arbitrary i ca j with j := j k = i k + 1, 




nf ’ nj 1 


= a 


A 


B 


A L i 1L i+1 3 


for all A, B > 0, 


thanks to the factorization (45), and to the properties of the logarithmic mean (48).

Lemma 4. For each $k \in \{1,\dots,d\}$, the matrix $M^{[k],h}$ induces a $\lambda^h$-contractive flow on $\mathbb{R}^N$ with
respect to the corresponding Onsager operator $K^{[k],h}$.

Proof. Eventually, we will apply [5, Theorem 3.1], which deals precisely with matrices $M^{[k],h}$ and
operators $K^{[k],h}$ of the forms (54) and (58), respectively. But first, we establish the following
auxiliary estimate

\[ \sqrt{\Pi^{[k],h}_{i+1}\,\Pi^{[k],h}_{i-1}} \le \Big(1-\frac{h^2}{2}\lambda^h\Big)\,\Pi^{[k],h}_i. \tag{60} \]


Indeed, by A-convexity of we have that 

\(yW{x k + h) + VW(x k - h)) > VW(x k ) + ±h 2 . 

Integration of this inequality from x k = (i — 1 )h to x k = ih yields 

which further implies that 

\J exp ( — v}+\ h ) exp(-V}!l\ h ) < exp ^-yA^ exp(-y i [fel ’ h ). 

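Recalling the definition (44) of $\Pi^{[k],h}$, and the definition (52) of $\lambda^h$, the estimate (60) follows.
An immediate consequence of (60) is the validity of the monotonicity hypotheses

\[ \alpha^{[k],h}_{i+1} \le \alpha^{[k],h}_i \quad\text{and}\quad \beta^{[k],h}_i \le \beta^{[k],h}_{i+1}. \tag{61} \]

Therefore, [5, Theorem 3.1] is applicable. It provides the (CDI) for $M^{[k],h}$ with respect to the
Onsager operator $K^{[k],h}$, for each

\[ \lambda \le \lambda_* := \min_i \Big[\big(\alpha^{[k],h}_i - \alpha^{[k],h}_{i+1}\big) + \big(\beta^{[k],h}_i - \beta^{[k],h}_{i-1}\big)\Big]. \tag{62} \]

Now, from the definitions (55) and (56) of $\alpha^{[k],h}$ and $\beta^{[k],h}$, it follows via (60) that

\[
\begin{aligned}
\alpha^{[k],h}_i - \alpha^{[k],h}_{i+1} &= h^{-2}\sqrt{\Pi^{[k],h}_{i+1}/\Pi^{[k],h}_i} - h^{-2}\sqrt{\Pi^{[k],h}_{i+2}/\Pi^{[k],h}_{i+1}} \ge h^{-2}\Big(\frac{h^2}{2}\lambda^h\Big)\sqrt{\Pi^{[k],h}_{i+1}/\Pi^{[k],h}_i}, \\
\beta^{[k],h}_i - \beta^{[k],h}_{i-1} &= h^{-2}\sqrt{\Pi^{[k],h}_i/\Pi^{[k],h}_{i+1}} - h^{-2}\sqrt{\Pi^{[k],h}_{i-1}/\Pi^{[k],h}_i} \ge h^{-2}\Big(\frac{h^2}{2}\lambda^h\Big)\sqrt{\Pi^{[k],h}_i/\Pi^{[k],h}_{i+1}},
\end{aligned}
\tag{63}
\]

which implies that

\[ \lambda_* \ge \lambda^h \min_{i=2,\dots,N-1} \cosh\Big(\tfrac12\big(V^{[k],h}_i - V^{[k],h}_{i+1}\big)\Big) \ge \lambda^h, \tag{64} \]

as desired.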

□ 
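The quantities entering this proof can be tabulated for a concrete potential. The sketch below is an illustration under stated assumptions, not the construction from (44): it takes the hypothetical one-dimensional values $V_i = \frac{\lambda}{2}(x_i - \frac12)^2$ at cell midpoints with weights $\exp(-V_i)$, uses the rates $h^{-2}\exp(\pm\frac12(V_i - V_{i+1}))$, and forms the sum of decrements $(\alpha_i - \alpha_{i+1}) + (\beta_i - \beta_{i-1})$ as the comparison quantity. All variable names are chosen here.

```python
import math

lam = 4.0          # convexity parameter of the hypothetical potential
n = 20             # number of cells
h = 1.0 / n

# Assumed discrete potential and weights (midpoint sampling, not normalised).
V = [0.5 * lam * ((i + 0.5) * h - 0.5) ** 2 for i in range(n)]
Pi = [math.exp(-v) for v in V]

# Discrete contraction rate from (52).
lam_h = 2.0 / h ** 2 * (1.0 - math.exp(-0.5 * h ** 2 * lam))

# Birth rates alpha_i (jump i -> i+1) and death rates beta_i (jump i+1 -> i).
alpha = [math.exp(0.5 * (V[i] - V[i + 1])) / h ** 2 for i in range(n - 1)]
beta = [math.exp(0.5 * (V[i + 1] - V[i])) / h ** 2 for i in range(n - 1)]

# Slack in the auxiliary estimate (60) at interior cells (should be >= 0).
aux_slack = min((1.0 - 0.5 * h ** 2 * lam_h) * Pi[i] - math.sqrt(Pi[i + 1] * Pi[i - 1])
                for i in range(1, n - 1))

# Monotonicity hypotheses (61).
alpha_monotone = all(alpha[i + 1] <= alpha[i] for i in range(n - 2))
beta_monotone = all(beta[i] <= beta[i + 1] for i in range(n - 2))

# Smallest sum of decrements over the indices where all four rates exist.
decrement_sum = min((alpha[i] - alpha[i + 1]) + (beta[i] - beta[i - 1])
                    for i in range(1, n - 2))
```

For a quadratic potential sampled at midpoints the estimate (60) holds with equality, so `aux_slack` vanishes up to rounding, and the smallest decrement sum is attained at the central pair of cells, where it coincides with `lam_h`.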


Proposition 3 follows immediately by combining Lemma 4 with the tensorisation result from
Theorem 2 below.

Remark 5. In the setting of Proposition 3, it is possible to prove the stronger property of λ-convexity
(with a slightly worse constant) with a minor modification of the proof. Instead of using
[5, Theorem 3.1] to obtain the inequality (62) as above, one could apply Mielke's criterion from
[?, Theorem 5.1] to obtain $\tilde\lambda^h$-convexity, with


(65) 




\»:=2 min X / (af - a&X/jj* 1 - $2). 


Note that the arithmetic mean in (62) is replaced by a geometric mean in (65). It is easily seen that
$\lambda^h \ge \tilde\lambda^h$, but that the difference $\lambda^h - \tilde\lambda^h = O(h^2)$ becomes negligible in the discrete-to-continuous
limit $h \downarrow 0$. In view of the tensorisation result from (66), the result remains valid in any dimension
with the same constant.

3.4. Tensorisation of convex entropy decay for Markov chains. In this section, we sketch 
the proof for stability of the inequality (CDI) under tensorization. This result is independent of 
the discretization and might be of interest in its own right.

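We need to fix some notation. First, we recall an alternative representation of a continuous
time Markov chain on a finite set $\mathcal{X}$: the generator $\mathcal{L} : \mathbb{R}^{\mathcal{X}} \to \mathbb{R}^{\mathcal{X}}$ can be written as

\[ (\mathcal{L}\psi)_i = \sum_{\delta\in G} c_{i,\delta}\,\big(\psi_{\delta(i)} - \psi_i\big). \]

Here, $G$ is a set of maps from $\mathcal{X}$ to $\mathcal{X}$ representing the possible jumps, and $c_{i,\delta} \ge 0$ denotes the jump
rate from $i$ to $\delta(i)$. For brevity, we shall write $\nabla_\delta\psi_i := \psi_{\delta(i)} - \psi_i$.

Throughout this section we assume that the following reversibility conditions are satisfied:

• for every $\delta \in G$ there exists a unique $\delta^{-1} \in G$ satisfying $\delta^{-1}(\delta(i)) = i$ for all $i$ with $c_{i,\delta} > 0$;

• there exists a probability measure $\pi = (\pi_i)_{i\in\mathcal{X}}$ on $\mathcal{X}$ such that $\pi_i > 0$ for all $i$, and

\[ \sum_{i\in\mathcal{X},\,\delta\in G} F(i,\delta)\,c_{i,\delta}\,\pi_i = \sum_{i\in\mathcal{X},\,\delta\in G} F(\delta(i),\delta^{-1})\,c_{i,\delta}\,\pi_i \]

for all $F : \mathcal{X}\times G \to \mathbb{R}$.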

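The relative entropy functional $\mathcal{H}_\pi : \mathcal{P}_+(\mathcal{X}) \to \mathbb{R}$ is given by

\[ \mathcal{H}_\pi(u) = \sum_{i\in\mathcal{X}} u_i \log(u_i/\pi_i). \]

In accordance with the situation described before, we introduce an Onsager operator $K$ such that
$\mathcal{L}^* u = -K_u D_u\mathcal{H}_\pi(u)$:

\[ p[K_u q] = \frac12 \sum_{i\in\mathcal{X},\,\delta\in G} \pi_i\,c_{i,\delta}\,\Lambda_{i,\delta}(u)\,\nabla_\delta p_i\,\nabla_\delta q_i, \quad\text{where}\quad \Lambda_{i,\delta}(u) = \Lambda\Big(\frac{u_i}{\pi_i}, \frac{u_{\delta(i)}}{\pi_{\delta(i)}}\Big). \]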

For later reference, let us calculate the Hessian M: it follows from the definition that 

^ ^ v ^ ^ 'R’iCifS-h-i, S 5Pi C5(i),ri^ rjP5(i) Ci,r)^ r]Pi 

i€.T 8,rj^G 

Ci,d(VSPi) "F ^ r)(P j /^5(i) 

iGX 8,r]£G 

= : ^ ^ ^ 
i£T S,r)£G 

where A ™ s (u) = for m = 1,2. See [T51[33] for details. 
Now consider a collection of Markov chains $(\mathcal{X}^k, \mathcal{L}^k, \pi^k)$ for $k = 1,\dots,N$. The corresponding
product chain $(\mathcal{X}^\otimes, \mathcal{L}^\otimes, \pi^\otimes)$ is defined by

\[ \mathcal{X}^\otimes = \mathcal{X}^1\times\cdots\times\mathcal{X}^N, \qquad (\mathcal{L}^\otimes\psi)_i = \sum_{k} (\mathcal{L}^k\psi)_i \ \text{ for } \psi \in L^\infty(\mathcal{X}^\otimes), \qquad \pi^\otimes = \pi^1\otimes\cdots\otimes\pi^N. \]

Here it is understood that $\mathcal{L}^k$ acts on the $k$-th coordinate of $\psi$. The corresponding Onsager operator
and Hessian will be denoted by $K^\otimes$ and by $M^\otimes$, respectively. To simplify notation we shall write
$\mathcal{H}^k := \mathcal{H}_{\pi^k}$.

It has been shown in [?] that geodesic λ-convexity is preserved under tensorisation:

\[ M^k \ge \lambda_k K^k \ \text{ for all } k = 1,\dots,N \quad\Longrightarrow\quad M^\otimes \ge \big(\min_k \lambda_k\big) K^\otimes. \tag{66} \]

This result is dimension independent, i.e., the bound does not depend on $N$. The goal of this
section is to verify that the corresponding tensorisation property also holds for the convex entropy
decay inequality (CDI).


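Theorem 2 (Tensorisation of convex entropy decay). Suppose that the inequality (CDI) holds for
each $k = 1,\dots,N$:

\[ M^k_u\big(D_u\mathcal{H}^k, D_u\mathcal{H}^k\big) \ge \lambda_k\, D_u\mathcal{H}^k\big[K^k_u D_u\mathcal{H}^k\big] \]

for all $u \in \mathcal{P}_+(\mathcal{X}^k)$ and some $\lambda_k \in \mathbb{R}$. Then (CDI) also holds for the product chain:

\[ M^\otimes_u\big(D_u\mathcal{H}^\otimes, D_u\mathcal{H}^\otimes\big) \ge \big(\min_k \lambda_k\big)\, D_u\mathcal{H}^\otimes\big[K^\otimes_u D_u\mathcal{H}^\otimes\big] \]

for all $u \in \mathcal{P}_+(\mathcal{X}^\otimes)$.

The proof follows along the lines of the proof of (66) in [?]. For the convenience of the reader
we provide some details.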

Proof. For each k = 1 ,K, set If = Y\m Tj, and for i £ X®, let if. £ If. be the multi-index 
with the fcth entry i k omitted. For a function p : X® —>• R, define its reduction p *£ : I k —>• M where 
all indices except i k are fixed to if., i.e., p k = p i . Likewise, introduce u l ~ k : I k M + . Finally, set 

TT* = <8>4f^fc 

It follows from the definitions that the Onsager matrix for the product system admits a decom¬ 
position of the form 

N 

(67) p[K®p] = E E ■ 

k-1 ij.6 If. 

Also M® can be split into terms corresponding to the different components: 

N 

M ?[p.p] = E M ^[p>pL where M k ’ e \p,p] = E E 

k,l=i iei® seg k ,rieg e 


and $G^k$ denotes the set of maps associated with the operator $\mathcal{L}^k$. It has been shown in [?] that the
off-diagonal terms are non-negative (regardless of the λ-convexity properties of the components):

\[ M^{k,\ell}_u[p,p] \ge 0 \quad\text{if } k \ne \ell \tag{68} \]

for all $u \in \mathcal{P}_+(\mathcal{X}^\otimes)$ and $p : \mathcal{X}^\otimes \to \mathbb{R}$. The on-diagonal terms satisfy

(69) M i’ k \p,p}= J2 

Now fix u £ Then 

{p u U^)] k k = log(ui/7Ti) = log(u’*/7rfj - log(7rf.), (D uli H fc ) = log(u’*/7ifj, 
which allows to conclude that 

(D u ^) i *=D ti<Jl W fc -log(4). 

Hence both derivatives coincide up to a constant, whose value is irrelevant, since $M_u[p,p]$ depends
on $p$ only through the values of its discrete derivatives $\nabla_\delta p$. Putting everything together, one
obtains


N 




k—1 
N 


^3> forfc E A * E 4 D n -^"[ K ^ D ^^ fc ] f } ( m , in ^) vmk%v u h„], 


k =1 


which is the desired result. 


□ 


4. Discretization of the QDD equation 

In this section, we study the gradient flow of the discretized Fisher information $\mathcal{I}^h$, which is
defined by

, , , r—r flog{Ui/U^) -log(f7j/nf)\ 2 

(70) X h (U) = D u 'H h [—M h U] = h ‘. 


1«-VJ 


Notice that this definition is in accordance with (14).

4.1. Existence of the gradient flow. 

Lemma 5. $\mathcal{I}^h$ is well-defined and non-negative on $\mathcal{P}_+(J^h)$, with $\mathcal{I}^h(U) = 0$ if and only if $U = \Pi^h$.
Moreover, $\mathcal{I}^h$ has the alternative representation


(71) 


X\U) = h d ~°- £ ( S - S ) (log S - log c/j 




nf nj' 


ii,'' ° i if' 


for each $U \in \mathcal{P}_+(J^h)$. Finally, all sublevel sets of $\mathcal{I}^h$ are relatively compact in $\mathcal{P}_+(J^h)$.

Remark 6. Since the closure of $\mathcal{P}_+(J^h)$ in $\mathbb{R}^{J^h}$ is the compact simplex


V(J h ) = { U £ R J + h ; £ Uj = 1 \ , 

j &J h 
a subset of $\mathcal{P}_+(J^h)$ is relatively compact in $\mathcal{P}_+(J^h)$ if and only if it is a closed subset of $\mathcal{P}(J^h)$. A
consequence is that if $A \subset \mathcal{P}_+(J^h)$ is relatively compact, then it has a positive distance $\delta > 0$ to the
boundary of $\mathcal{P}(J^h)$, i.e.,

\[ \inf_{U\in A}\,\min_{j\in J^h} U_j > 0. \tag{72} \]


Proof. Well-definedness and non-negativity are obvious from (70). Since $\Lambda_{ij}(U) > 0$ for each
$U \in \mathcal{P}_+(J^h)$, and since any two indices $i, j \in J^h$ can be connected by a sequence of neighbors,
$\mathcal{I}^h(U) = 0$ holds if and only if $\log(U_i/\Pi^h_i)$ is a constant independent of $i$. That is, $U = a\Pi^h$ for a
global constant $a > 0$. Now $U \in \mathcal{P}_+(J^h)$ implies $a = 1$, i.e., $U = \Pi^h$.

The representation (71) follows immediately from the definition (48) of the logarithmic mean,
since $a - b = \Lambda(a,b)\,(\log a - \log b)$.
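The cancellation behind this step can be seen numerically. The sketch below is illustrative only: it works on a one-dimensional chain, omits the common prefactor in front of the sums, and uses the edge weights $\sqrt{\Pi_i\Pi_j}$, which are an assumption inferred here from the matrix entries in (50) rather than quoted from the definitions of the two representations. The function names and sample data are hypothetical.

```python
import math

def log_mean(a, b):
    # Logarithmic mean Lambda(a, b) = (a - b) / (log a - log b),
    # extended continuously by Lambda(a, a) = a.
    return a if a == b else (a - b) / (math.log(a) - math.log(b))

def fisher_forms(U, Pi):
    # Returns the two candidate expressions summed over neighbouring pairs:
    #   quad: weight * Lambda(rho_i, rho_j) * (log rho_i - log rho_j)^2
    #   prod: weight * (rho_i - rho_j) * (log rho_i - log rho_j)
    # with rho = U / Pi and the assumed weight sqrt(Pi_i * Pi_j).
    rho = [u / p for u, p in zip(U, Pi)]
    quad = 0.0
    prod = 0.0
    for i in range(len(U) - 1):
        j = i + 1
        weight = math.sqrt(Pi[i] * Pi[j])
        dlog = math.log(rho[i]) - math.log(rho[j])
        quad += weight * log_mean(rho[i], rho[j]) * dlog ** 2
        prod += weight * (rho[i] - rho[j]) * dlog
    return quad, prod

Pi = [0.1, 0.2, 0.4, 0.2, 0.1]       # sample stationary weights
U = [0.3, 0.1, 0.25, 0.15, 0.2]      # sample positive density with the same mass

quad, prod = fisher_forms(U, Pi)
zero_quad, zero_prod = fisher_forms(Pi, Pi)
```

Both expressions agree for any positive `U`, and both vanish at `U = Pi`, in line with the first claim of the lemma.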

It remains to prove the compactness of sublevel sets. By continuity of $\mathcal{I}^h$, any sublevel set
$A := (\mathcal{I}^h)^{-1}([0,a])$ is relatively closed in $\mathcal{P}_+(J^h)$. In view of Remark 6 above, it remains to
be verified that $A$ is also closed in $\mathcal{P}(J^h)$, i.e., that the closure of $A$ in $\mathbb{R}^{J^h}$ does not intersect
$\mathcal{P}(J^h) \setminus \mathcal{P}_+(J^h)$. Towards a contradiction, assume that a sequence $(U_n)_{n\in\mathbb{N}}$ in $A$ is such that
$U_n \to U^* \notin \mathcal{P}_+(J^h)$; we are going to show that $\mathcal{I}^h(U_n) \to \infty$. By compactness of $\mathcal{P}(J^h)$ in $\mathbb{R}^{J^h}$,
the limit $U^*$ lies in the boundary $\mathcal{P}(J^h) \setminus \mathcal{P}_+(J^h)$. Thus, there is some $i_* \in J^h$ with $U^*_{i_*} = 0$. On
the other hand, $U^* \in \mathcal{P}(J^h)$ implies that there is some $i^* \in J^h$ with $U^*_{i^*} > 0$. Since $i_*$ and $i^*$ can
be connected by a sequence of neighbors, there must exist $j_*, j^* \in J^h$ with $j_* \sim j^*$ and $U^*_{j_*} = 0$,
$a := U^*_{j^*} > 0$. Since all the terms in the summation in (71) are non-negative,


l h (U n ) > K 


d—2 



UR UR 

log nt' lo § i# 


= h d ~ 2 * 





(UR 

UR \ 

UR 

J 

J* 

l0g ttTT 

In?. 

nf j 

j 

j. / 

j* 


HU 



urt u\ 



=(//) 


By the choices made above, 


while for all sufficiently large n, 


Ut 


Ut 


(/) -5> — 7 - log -A- > —e 

v ; Ilj'. 6 111'. “ 


1 UR ( UR \ 

" “ 25 J; (“ log If)’ 


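which obviously diverges to $+\infty$ as $n \to \infty$. □

We calculate the derivative of $\mathcal{I}^h$:

\[ D_U\mathcal{I}^h[\Xi] = D^2_U\mathcal{H}^h[-M^hU, \Xi] + D_U\mathcal{H}^h[-M^h\Xi] = -h^d \sum_{i\in J^h} \Xi_i \Big(\frac{(M^hU)_i}{U_i} + \sum_{j\in J^h} M^h_{ji}\log(U_j/\Pi^h_j)\Big). \]

In other words, with a certain abuse of notation, the gradient flow of $\mathcal{I}^h$ is given by

\[ \dot U = -K^h_U D_U\mathcal{I}^h = K^h_U\Big(\frac{M^hU}{U} + (M^h)^T\log(U/\Pi^h)\Big). \tag{73} \]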
The initial value problem for this gradient flow is well-posed.

Lemma 6. For every initial condition $U_0 \in \mathcal{P}_+(J^h)$, there is a unique differentiable curve
$U : [0,\infty) \to \mathcal{P}_+(J^h)$ satisfying (73) with $U(0) = U_0$.

Proof. The right-hand side of (73) is obviously smooth in $U$. By the standard theory of ordinary
differential equations, there exists a maximal local solution $U : [0,T) \to \mathcal{P}_+(J^h)$. Here "maximal"
means that either $T = \infty$, i.e., the local solution is global, or that there is no limit point in $\mathcal{P}_+(J^h)$
of $U(t)$ for $t \uparrow T$. We are now going to prove that the second alternative is impossible.

Indeed, since $U_0 \in \mathcal{P}_+(J^h)$, we have $a := \mathcal{I}^h(U(0)) < \infty$ by Lemma 5. Now $U$ being a gradient
flow implies that $\mathcal{I}^h(U(t)) \le a$ for all $t \in [0,T)$. That is, the curve $U$ lies in the sublevel set
$A := (\mathcal{I}^h)^{-1}([0,a])$, which is compact by Lemma 5. The smooth vector field $U \mapsto K^h_U D_U\mathcal{I}^h$ is
bounded on $A$, and consequently, $U$ is uniformly Lipschitz continuous on $[0,T)$. Therefore, $U(t)$
has a limit in $A$ for $t \uparrow T$. □

Remark 7. In the obvious way, $M^h$ induces a linear operator $\bar M^h$ on the subspace of density
functions $u^h \in \mathcal{P}_+(\Omega)$ that are piecewise constant on each sub-cube $\omega_j$. In the same spirit, $K^h$
induces a compatible Onsager operator $\bar K^h$,

\[ \bar K^h_{u^h} p^h = K^h_U P. \]

With these notations, the discrete analogue (73) of the QDD equation (1) can be written in the
following way

\[ \partial_t u = \bar K^h_u\Big(\frac{\bar M^h u}{u} + (\bar M^h)^*\log(u/\pi^h)\Big), \tag{74} \]

which is a discretized version of (42).

4.2. Proof of the main theorem. We are finally in the position to prove Theorem 1, i.e., we
derive the estimates (17), (18) and (19) for the gradient flow (74) of the discrete Fisher information
functional $\mathcal{I}^h$.

The estimates (17) and (18) follow easily by means of Proposition 2. Indeed, in order to verify
that Proposition 2 applies in our situation, it suffices to observe that the discrete entropy functional
$\mathcal{H}^h$ satisfies (CDI), which is a consequence of Proposition 3 above, and of the fact that $\mathcal{I}^h = |\partial\mathcal{H}^h|^2$
by definition in (70). For the proof of (19), we combine the first estimate in (18) with the Csiszar-
Kullback inequality, see e.g. [?], which specializes in the case at hand to

\[ \|u^h - \pi^h\|^2_{L^1(\Omega)} \le 2\,\mathcal{H}^h(u^h). \]


4.3. Discretization in time. We shall now use our spatial discretization as basis for the
implementation of a numerical scheme for approximate solution of (1). More precisely, we apply a
discretization in time to the ordinary differential equations (74),

\[ \frac{d}{dt}U = F^h(U) = K^h_U S_U, \quad\text{with}\quad (S_U)_i = \frac{(M^hU)_i}{U_i} + \sum_{j\in J^h} M^h_{ji}\log\Big(\frac{U_j}{\Pi^h_j}\Big). \]

For discretization in time, an implicit Euler scheme is employed: we replace the function
$U^h : [0,T] \to \mathcal{P}_+(J^h)$ by a sequence $(U^{h,\tau}_m)_m$ with the interpretation that $U^{h,\tau}_m$ approximates $U^h(m\tau)$, and
solve

\[ \frac{U^{h,\tau}_m - U^{h,\tau}_{m-1}}{\tau} = F^h(U^{h,\tau}_m) \tag{75} \]

inductively for $m = 1, 2, \dots$. The (first order) implicit Euler method is the canonical choice here
since it transfers the decay estimates (18) from the semi-discrete to the fully discrete level.


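Proposition 4. Assume that a sequence $(U^{h,\tau}_m)_{m\ge 0}$ satisfies (75). Then the following time-discrete
variants of (18) hold:

\[ \mathcal{H}^h(U^{h,\tau}_m) \le \mathcal{H}^h(U^{h,\tau}_{m'})\,\big(1+(2\lambda^h)^2\tau\big)^{-(m-m')} \quad\text{and}\quad \mathcal{I}^h(U^{h,\tau}_m) \le \mathcal{I}^h(U^{h,\tau}_{m'})\,\big(1+(2\lambda^h)^2\tau\big)^{-(m-m')}, \tag{76} \]

for all integers $m \ge m' \ge 0$.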

The proof of Proposition 4 is a consequence of the following convexity property.

Lemma 7. Both $\mathcal{H}^h$ and $\mathcal{I}^h$ are convex on $\mathcal{P}_+(J^h)$ in the sense of linear interpolation.

Remark 8. We emphasize that convexity with respect to linear interpolation and geodesic convexity
with respect to the Onsager operator $K^h$ are (almost) unrelated notions.

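Proof of Lemma 7. First, recall that $\phi : \mathbb{R}_+ \to \mathbb{R}$ with $\phi(s) = s\log s$ has derivatives

\[ \phi'(s) = 1 + \log s, \qquad \phi''(s) = 1/s. \]

Given $U \in \mathcal{P}_+(J^h)$ and $\Xi \in T_U\mathcal{P}_+(J^h)$, we have on the one hand that

\[ D^2_U\mathcal{H}^h[\Xi]^2 = h^d \sum_{j\in J^h} \partial^2_{U_j}\big(U_j\log U_j - U_j\log\Pi^h_j\big)\,\Xi_j^2 = h^d \sum_{j\in J^h} U_j^{-1}\,\Xi_j^2 \ge 0, \]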


and on the other hand that 


B^E} 2 = h d J2 




lh 

n/ 


Up 

n i\ 


log 


i n2 

n! 1 / 




Up 

n j- 


ih 

n/ 


log 


Up 

n f. 


— 2d\j.du 


= h d J2\f U M 

i-oj 


U}. 

nj 1 og u(f 



i^J 


n i l Uj 3 t 


Ui nj 1 nf 


^ nf it'- 


> o. 


Non-negativity of the second derivatives implies convexity. □

Proof of Proposition 4. Apply the derivative of $\mathcal{H}^h$ at $U^{h,\tau}_m$ to (75) to obtain

\[ -D_{U^{h,\tau}_m}\mathcal{H}^h\Big[\frac{U^{h,\tau}_m - U^{h,\tau}_{m-1}}{\tau}\Big] = D_{U^{h,\tau}_m}\mathcal{H}^h\Big[K^h_{U^{h,\tau}_m} D_{U^{h,\tau}_m}\mathcal{I}^h\Big] \ge (2\lambda^h)^2\,\mathcal{H}^h(U^{h,\tau}_m), \]

where we have used the estimate (36) and (CDI) with constant $\lambda^h$ to obtain the inequality.
Furthermore, since $\mathcal{H}^h$ is convex by Lemma 7 above,

\[ \mathcal{H}^h(U^{h,\tau}_{m-1}) \ge \mathcal{H}^h(U^{h,\tau}_m) - D_{U^{h,\tau}_m}\mathcal{H}^h\big[U^{h,\tau}_m - U^{h,\tau}_{m-1}\big] \ge \big(1+(2\lambda^h)^2\tau\big)\,\mathcal{H}^h(U^{h,\tau}_m). \]

An iteration of this estimate yields the first inequality in (76). The proof of the second inequality
is obtained in an analogous way, now applying the derivative of $\mathcal{I}^h$ in place of $\mathcal{H}^h$ to (75), using
the estimate (37), and the convexity of $\mathcal{I}^h$ with respect to linear interpolation. □
























FIGURE 1. Left and middle: initial condition $u_0$ from (77) in normal and in logarithmic
scale. Right: values of the corresponding solution on the one-dimensional
cross-section $x_2 = 1/2$ at times $t = k \times 10^{-6}$ for $k = 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0$.


4.4. Numerical experiments. In our experiments, we restrict attention to the two-dimensional
situation $d = 2$, i.e., $\Omega = [0,1]^2$ is the unit square. For the potential, we have used $V(x) =
\frac{\lambda}{2}|x - \bar x|^2$, with $\bar x = (1/2, 1/2)$ the center of $[0,1]^2$, corresponding to $W(x) = \lambda^2|x - \bar x|^2 - 4\lambda$.
Different choices for the convexity parameter $\lambda$ are used in the simulations. In each experiment, a
spatial resolution of $N = 30$ grid points in each direction has been used. The time step $\tau > 0$ is
chosen in dependence of $\lambda$; since we solve the implicit Euler scheme (75) by an undamped Newton
iteration in each time step, a sufficiently small $\tau$ is necessary for numerical well-posedness of the
scheme.


4.4.1. Illustration of qualitative behavior. For illustration of the complex qualitative behavior of
solutions $u$ to (1), we report results for a numerical experiment in the unconfined case $W \equiv 0$, i.e.,
$\lambda = 0$, for the initial datum

\[ u_0(x) = Z^{-1}\big(\cos^{16}(\pi x_1) + \cos^{16}(\pi x_2)\big) + 10^{-4}, \tag{77} \]

where $Z = 0.392\ldots$ is such that $u_0$ integrates to one on $[0,1]^2$. The initial condition is drawn in
Figure 1. This is a straightforward generalization of the one-dimensional example from [3, Figure
1] to two space dimensions. Notice that $u_0$ has a large plateau where its values are very small (order
$10^{-4}$) in comparison to the average value (order one). The sharp flanks at the edge of the plateau
drive the dynamics and lead to a rather complicated spatio-temporal behavior of the solution, see
Figure 2. The right panel of Figure 1 shows a one-dimensional cross-section of the solution; qualitatively,
the behavior is in perfect agreement with the one-dimensional simulations from [3]. A time step
$\tau = 10^{-7}$ has been used for the numerical solution in order to resolve the process of creation and
destruction of local minima inside the plateau region, which happens on a time scale of $10^{-6}$.
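The quoted value of the normalisation constant can be reproduced by a short computation. The sketch below assumes that $Z$ is the integral of $\cos^{16}(\pi x_1) + \cos^{16}(\pi x_2)$ over the unit square, which is an interpretation adopted here for illustration; the small offset $10^{-4}$ is ignored. The function names are hypothetical.

```python
import math

def cos16_integral():
    # Exact integral of cos^16(pi x) over one full period [0, 1]:
    # the mean of cos^(2m) over a period is C(2m, m) / 4^m, here with m = 8.
    return math.comb(16, 8) / 2 ** 16

def midpoint_rule(n):
    # Midpoint quadrature of cos^16(pi x) on [0, 1] with n cells.
    h = 1.0 / n
    return sum(math.cos(math.pi * (i + 0.5) * h) ** 16 for i in range(n)) * h

# Each of the two terms integrates to the same one-dimensional value on the square.
Z_exact = 2.0 * cos16_integral()
Z_numeric = 2.0 * midpoint_rule(200)
```

The closed form gives $Z = 2 \cdot 12870 / 65536 \approx 0.3928$, consistent with the value $0.392\ldots$ stated in the text.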

4.4.2. Rates of equilibration. The goal of the following series of experiments is the numerical
verification of the analytically estimated rates of equilibration. We vary the convexity parameter $\lambda$ and
apply the numerical scheme to the very regular initial condition

u 0 (x) = - sin 2 (37ra;i) sin 2 (27rx2) + -(1 + x\ + x 2 ). 
o o 


( 78 ) 
Figure 2. Behavior of the solution for the initial condition $u_0$ from (77) at times
$t = 0.3 \times 10^{-6}$, $t = 1.0 \times 10^{-6}$, and $t = 3.0 \times 10^{-6}$ (left to right), plotted in
logarithmic scale. The transparent plane at $u = 10^{-4}$ has been introduced to
visualize that the solution does not obey a maximum principle.


The behavior of entropy and Fisher information is monitored for about one thousand time
iterations. The qualitative change in density is shown in Figure 4. The corresponding results for entropy
and Fisher information are collected in Figure 3.

In agreement with the analytical estimates, both quantities decay with a rate of at least
$(2\lambda^h)^2$. In fact, in each experiment we measure a minimal decay rate $(2\lambda_*)^2$ that is strictly larger
than the analytically predicted rate. Generally, the difference $\lambda_* - \lambda^h$ is the larger the smaller
$\lambda > 0$ is, and becomes negligible for large values $\lambda \gtrsim 10$.

This phenomenon is apparently independent of the spatial resolution $h > 0$. Our conjecture is
the following. For solutions to the Fokker-Planck equation (2), the estimates (9) are not sharp: the
lower bound on the rate of equilibration is given by $2\lambda_*$ with some $\lambda_* > \lambda$. This improvement is
due to boundary effects: it is negligible if the steady state $\pi$ is very concentrated inside $\Omega$ (as is the
case for $\lambda \gtrsim 10$), but is significant for more equally distributed stationary densities $\pi$ (for $\lambda < 10$ or
less). Thanks to its intimate relation to the Fokker-Planck equation (2), the fourth order equation
(1) apparently inherits these improved rates, i.e., one can replace $(2\lambda)^2$ by $(2\lambda_*)^2$ in the estimates
(11). These improved estimates pass on to the estimates (18) and (76) on the discretization.

Our conjecture is strongly supported by the outcome of the experiments. In Figure 3, the decay
rates of entropy and Fisher information are compared to $(2\lambda_*)^2$, where $\lambda_* > \lambda^h$ is the smallest
non-zero eigenvalue of the associated Markov generator $M^h$ on $J^h$. In all of the experiments that
have been performed, the numerically measured rates of decay of entropy and Fisher information,

\[ \frac1\tau\Big(\log\mathcal{H}^h(U^{h,\tau}_m) - \log\mathcal{H}^h(U^{h,\tau}_{m+1})\Big) \quad\text{and}\quad \frac1\tau\Big(\log\mathcal{I}^h(U^{h,\tau}_m) - \log\mathcal{I}^h(U^{h,\tau}_{m+1})\Big), \quad\text{respectively,} \]

never fall below the value $(2\lambda_*)^2$. In fact, the numerically measured rates have always been larger
but appear to tend towards $(2\lambda_*)^2$ as the system approaches equilibrium. This is in accordance
with the observation from [33] that the equilibration rates are minimized in the linearized regime
around the steady state.
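The measured rate described above is a difference quotient of logarithms between consecutive time steps. The following sketch shows how it behaves on a synthetic sequence; the data are not taken from the experiments. The sequence is constructed to decay exactly by the factor $(1 + c\tau)^{-1}$ per step, as in the time-discrete estimates, and the value of `c` as well as the function name are hypothetical.

```python
import math

def measured_rates(values, tau):
    # Difference quotient (log v_m - log v_{m+1}) / tau between consecutive steps.
    return [(math.log(values[m]) - math.log(values[m + 1])) / tau
            for m in range(len(values) - 1)]

tau = 1e-3   # time step
c = 4.0      # hypothetical continuous-time rate, playing the role of (2 * lambda)^2

# Synthetic sequence decaying by exactly (1 + c * tau)^(-1) in every step.
H = [(1.0 + c * tau) ** (-m) for m in range(50)]
rates = measured_rates(H, tau)
expected = math.log(1.0 + c * tau) / tau
```

For such a sequence the measured rate equals $\log(1 + c\tau)/\tau$, which lies slightly below `c` and approaches it as $\tau \to 0$; this small gap should be kept in mind when comparing measured rates of a fully discrete scheme with a continuous-time rate.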


Remark 9. It is tempting to turn the above conjecture into a proof, simply using the spectral gap
$\lambda_*$ instead of $\lambda$ as a lower bound on the modulus of geodesic convexity of $\mathcal{H}$ and performing all
the estimates accordingly. However, to our knowledge, there is no result available which allows to
Figure 3. Logarithmic plot of entropy (lower bold curve) and Fisher information
(upper bold curve) along discrete solutions for the initial condition (78), using
$\lambda = 1$, $\lambda = 10$ and $\lambda = 100$ (left to right). The dotted lines correspond to multiples
of $\exp(-(2\lambda_*)^2 t)$.



Figure 4. Snapshots of the discrete solution for the initial condition (78) using
$\lambda = 100$ at, respectively, $t = 10^{-5}$, $t = 3 \cdot 10^{-5}$ and $t = 10^{-4}$ (left to right). Note
the changing scale.


estimate the modulus of geodesic convexity of a Markov chain, or the corresponding constant $\lambda$
in the inequality (CDI), by its spectral gap from below.